Whitepaper – Streaming Hadoop Solutions

In this whitepaper, I take a look at the various options for streaming on Hadoop. These include Apache Storm, Apache Spark Streaming and Apache Samza. I also examine commercial alternatives, such as DataTorrent. I cover implementation details of streaming, including the types of streaming and the capabilities of the libraries and products included. You can read this whitepaper […]

Whitepaper – Practical Machine Learning

Here’s a whitepaper I wrote on the ‘state of Machine Learning’. It includes information about implementation via various cloud-based ML services (AWS, Azure, IBM) as well as category information (for architects). You are welcome to read this whitepaper online or to download it if you prefer (linked to Slideshare source). Enjoy!

How to: Developing for Aerospike with Python or C#

I’ve been doing some work lately with Aerospike, the super-fast in-memory database. See previous blog posts here about the speed of this product. Since I started working with Aerospike, the team there has announced that their core product is now open source. In this blog post, I’ll be covering how to get started developing with […]

How to: Installing AerospikeDB on Google Compute Engine

Recently, I’ve been doing some work with AerospikeDB. It is a super-fast in-memory NoSQL database. I gave a presentation at the recent BigDataCampLA on ‘Bleeding Edge Databases’ and included it because of its impressive benchmarks, such as 1 million TPS (read-only workload) PER SERVER and 40K TPS (read-write) on that same server. Here’s the live presentation, […]