Tag «ApacheHadoop»

BigData Investigation 10 – Using Hadoop Streaming on Hadoop Cluster in Pseudo-Distributed Mode

In this post I explain how to run the Hadoop Streaming utility on a Hadoop cluster in Pseudo-Distributed Mode. Hadoop Streaming creates a MapReduce job from executables or scripts that act as mapper and reducer, and submits the job to a Hadoop cluster. In an earlier post I explained how to run Hadoop Streaming in Local (Standalone) Mode. …

BigData Investigation 9 – Installing Apache Hadoop in Pseudo-Distributed Mode

In this post I explain how to configure Apache Hadoop in Pseudo-Distributed Mode. In an earlier post I explained how to install Apache Hadoop in Local (Standalone) Mode. Now I apply the configuration changes required to switch that installation to Pseudo-Distributed Mode. Step 1 – Install Apache Hadoop in Local (Standalone) Mode: …

BigData Investigation 8 – Using Hadoop Streaming on Hadoop Cluster in Local (Standalone) Mode

In this post I explain how to run the Hadoop Streaming utility on a Hadoop cluster in Local (Standalone) Mode. Hadoop Streaming creates a MapReduce job from executables or scripts that act as mapper and reducer, and submits the job to a Hadoop cluster. In an earlier post I explained how to download and install Apache Hadoop in …

BigData Investigation 7 – Installing Apache Hadoop in Local (Standalone) Mode

In this post I explain how to download Apache Hadoop and install it on CentOS 7 Linux in Local (Standalone) Mode. In earlier posts I used the Cloudera Quickstart VM to describe how to create MapReduce applications with Python and Hadoop Streaming. Using a pre-configured Hadoop cluster like the Cloudera Quickstart VM is convenient …