Tag «MapReduce»

BigData Investigation 5 – MapReduce with Python and Hadoop Streaming

In this post I explain the Hadoop Streaming utility. Hadoop Streaming creates a MapReduce job from executables or scripts, which act as the mapper and reducer, and submits the job to a Hadoop cluster. Hadoop’s programming model is called MapReduce. In a previous post I explained MapReduce using a Unix pipe which includes two Python scripts and a few …

BigData Investigation 4 – MapReduce Explained

In this post I explain MapReduce, Hadoop’s programming model for analyzing data. I use the Hadoop Book for my investigation of BigData; MapReduce is covered in chapter 2. Let’s study the examples to understand MapReduce. All code examples from the Hadoop Book are available on GitHub. First we need to copy the example data …