Yesterday I was pulling down some stock data from Yahoo, with the goal of building out a machine learning training set using Spark and Cassandra. If you haven’t tried Cassandra yet, it’s a database built for high availability and linear scalability. I’ve got an intro talk up here. Spark is another Apache project that kicks Cassandra into overdrive by providing a framework for batch analytics, streaming, and machine learning. Support for graph operations is on the way, which makes me giddy.
HDF5
- I’m trying to evaluate PyTables as a replacement for very large Python dictionaries, but I’m having trouble getting HDF5 installed on my Mac (OS X Snow Leopard). I keep getting this error: "configure: error: C compiler cannot create executables". I haven’t been able to figure out what’s wrong yet. Does anyone have any ideas? I’ve got Xcode Tools installed, and I compiled Apache, PHP and Memcached without issue before the Snow Leopard update.