Here’s the PDF of my full-day QCon London workshop, ‘Beyond Relational: Cloud Big Data Design Patterns’.
#HappyBuilding
Here’s a link to my slides from the workshop I delivered at QCon Sao Paulo, Brazil: “Real-world Cloud Big Data Patterns”.
Enjoy!
In this whitepaper, I take a look at the various options for streaming on Hadoop, including Apache Storm, Apache Spark Streaming, and Apache Samza. I also examine commercial alternatives, such as DataTorrent. I cover implementation details as well, including the type of streaming each option supports (micro-batch vs. record-at-a-time) and the capabilities of the libraries and products included.
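To give a flavor of the programming model these libraries expose, here’s a minimal word-count sketch using Spark Streaming’s Java API. The socket source on localhost:9999 and the five-second batch interval are illustrative assumptions for local experimentation, not details from the whitepaper:

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

import scala.Tuple2;

public class StreamingWordCount {
    public static void main(String[] args) throws InterruptedException {
        // Spark Streaming uses a micro-batch model: incoming records are
        // grouped into small batches (here, every 5 seconds).
        SparkConf conf = new SparkConf().setMaster("local[2]").setAppName("StreamingWordCount");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(5));

        // Illustrative source: lines of text arriving on a local socket
        // (e.g., feed it with `nc -lk 9999`).
        JavaReceiverInputDStream<String> lines = jssc.socketTextStream("localhost", 9999);

        // Split each line into words, then count occurrences per batch.
        JavaDStream<String> words =
                lines.flatMap(line -> Arrays.asList(line.split(" ")).iterator());
        JavaPairDStream<String, Integer> counts = words
                .mapToPair(word -> new Tuple2<>(word, 1))
                .reduceByKey(Integer::sum);

        counts.print();          // Print each micro-batch's counts to stdout.
        jssc.start();            // Begin receiving and processing data.
        jssc.awaitTermination(); // Block until the job is stopped.
    }
}
```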
You can read this whitepaper online or download it via the included SlideShare link.
Happy streaming!
Here’s my updated deck from Silicon Valley Code Camp (SVCC).
I’ve been working with Hadoop MapReduce in a variety of environments over the past couple of years, so I decided to pull my experience together into a free, multi-part screencast series on YouTube.
The course consists of five screencasts of 30 to 50 minutes each. Each part tackles some aspect of Hadoop MapReduce, from basic conceptual understanding through the most common tuning techniques. Throughout the series, I’ve included demos using a variety of vendor distributions of Hadoop, including Cloudera CDH4, Windows Azure HDInsight, Amazon Elastic MapReduce, and more.
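As a quick reference for what the series builds on, here’s the canonical word-count job in Hadoop’s Java MapReduce API; this is a generic sketch (input and output paths come from the command line), not a demo lifted from the course:

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map phase: emit (word, 1) for every word in the input split.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce phase: sum the counts for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class); // local pre-aggregation, a common tuning step
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```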
Below is the first module of the course.
Here is a link to the entire PowerPoint deck.
Here is a link to the course demo files.
Here’s the deck for my presentation ‘Microsoft’s Big Data Story’, which I’ll be giving April 8-10 at Big Data TechCon in Boston. Enjoy!
Here’s my updated deck for the upcoming SDC 2013 conference in Sweden; the core deck now includes information about Amazon Redshift and Cloudera Impala.
If you want to see demos from this deck (and more on NoSQL and BigData), check out my YouTube BigData channel.
This morning I tried out AWS Data Pipeline by creating a simple ETL process in the AWS cloud. The service currently works with AWS data sources such as S3, DynamoDB, and RDS.
I found that I needed to read the AWS documentation to create even a simple pipeline. Below is an example of a simple copy job in the Data Pipeline designer.
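If you’d rather script the pipeline than click through the designer, the same kind of copy job can be defined programmatically. Here’s a rough sketch using the AWS SDK for Java; the pipeline name, bucket paths, and object IDs are illustrative assumptions, and a production definition would also need a schedule and a compute resource to run on, per the AWS docs:

```java
import com.amazonaws.services.datapipeline.DataPipelineClient;
import com.amazonaws.services.datapipeline.model.ActivatePipelineRequest;
import com.amazonaws.services.datapipeline.model.CreatePipelineRequest;
import com.amazonaws.services.datapipeline.model.Field;
import com.amazonaws.services.datapipeline.model.PipelineObject;
import com.amazonaws.services.datapipeline.model.PutPipelineDefinitionRequest;

public class SimpleCopyPipeline {
    public static void main(String[] args) {
        DataPipelineClient client = new DataPipelineClient();

        // Register an empty pipeline; uniqueId guards against duplicate creation.
        String pipelineId = client.createPipeline(new CreatePipelineRequest()
                .withName("s3-copy-demo")           // illustrative name
                .withUniqueId("s3-copy-demo-001"))
                .getPipelineId();

        // Source S3 node (bucket and key are placeholders).
        PipelineObject source = new PipelineObject()
                .withId("SourceNode").withName("SourceNode")
                .withFields(
                        new Field().withKey("type").withStringValue("S3DataNode"),
                        new Field().withKey("filePath")
                                   .withStringValue("s3://my-bucket/input/data.csv"));

        // Destination S3 node.
        PipelineObject destination = new PipelineObject()
                .withId("DestNode").withName("DestNode")
                .withFields(
                        new Field().withKey("type").withStringValue("S3DataNode"),
                        new Field().withKey("filePath")
                                   .withStringValue("s3://my-bucket/output/data.csv"));

        // The copy activity wires source to destination by reference.
        PipelineObject copy = new PipelineObject()
                .withId("CopyJob").withName("CopyJob")
                .withFields(
                        new Field().withKey("type").withStringValue("CopyActivity"),
                        new Field().withKey("input").withRefValue("SourceNode"),
                        new Field().withKey("output").withRefValue("DestNode"));

        // Upload the definition, then activate the pipeline.
        client.putPipelineDefinition(new PutPipelineDefinitionRequest()
                .withPipelineId(pipelineId)
                .withPipelineObjects(source, destination, copy));
        client.activatePipeline(new ActivatePipelineRequest().withPipelineId(pipelineId));
    }
}
```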
Enjoy the screencast!
Follow along on my planned learning adventures in 2013! I’d also love to hear what (and how) you plan to learn about ‘all things data’ in 2013; comment on this blog post with your study tips.
Updated deck for my talk here today. Enjoy!