Whitepaper – Streaming Hadoop Solutions

In this whitepaper, I look at the options for stream processing on Hadoop, including Apache Storm, Apache Spark Streaming and Apache Samza.  I also examine commercial alternatives, such as DataTorrent.  I cover implementation details, including the type of streaming each option provides and the capabilities of the libraries and products covered.

You can read this whitepaper online or download it via the included SlideShare link.

Happy streaming!

New YouTube Series – Hadoop MapReduce Fundamentals

Hadoop MapReduce

I’ve been working with Hadoop MapReduce in several formats over the past couple of years.  I decided to pull together my experience and record it as a free, multi-part screencast series on YouTube.

The course consists of 5 screencasts, each 30 to 50 minutes long.  Each part tackles some aspect of Hadoop MapReduce, from a basic, conceptual understanding to the most common tuning techniques.  Throughout the series, I’ve included screencast demos using a variety of vendor distributions of Hadoop.  These demos include Cloudera CDH4, Windows Azure HDInsight, Amazon Elastic MapReduce (EMR) and more.

Below is the first module of the course.

Here is a link to the entire PowerPoint deck.

Here is a link to the course demo files.