Saturday, 3 May 2014

Hadoop Installation Using Cloudera Package - Pseudo Distributed Mode (Single Node)

[Previous Post]

Hadoop can also be installed using Cloudera's packages, which takes fewer steps and is easier. The difference is that Cloudera bundles Apache Hadoop and some ecosystem projects into a single package, with all of the configuration preset to localhost, so we do not need to edit the configuration files ourselves.

Installation Using the Cloudera Package

Prerequisite

1. Java


Installation Steps

Step 1: Set JAVA_HOME (in /etc/profile or ~/.bashrc)

unmesha@unmesha-hadoop-virtual-machine:~$ java -version
java version "1.7.0_55"
Java(TM) SE Runtime Environment (build 1.7.0_55-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.55-b03, mixed mode)

Check the current location of Java:

unmesha@unmesha-hadoop-virtual-machine:~$ sudo update-alternatives --config java
[sudo] password for unmesha: 
There is only one alternative in link group java: /usr/lib/jvm/java-7-oracle/jre/bin/java
Nothing to configure.

Set JAVA_HOME by adding the following line to /etc/profile (system-wide) or ~/.bashrc (per user), then reload the file:

export JAVA_HOME=/usr/lib/jvm/java-7-oracle
unmesha@unmesha-hadoop-virtual-machine:~$ source ~/.bashrc
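
To confirm the variable took effect, echo it back; the path shown assumes the Oracle JDK location found above:

unmesha@unmesha-hadoop-virtual-machine:~$ echo $JAVA_HOME
/usr/lib/jvm/java-7-oracle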

Step 2: Download the CDH4 repository package (cdh4-repository_1.0_all.deb) for your system, listed under the "On Ubuntu and other Debian systems, do the following:" heading on Cloudera's CDH4 download page.

Step 3: Install the repository package

unmesha@unmesha-hadoop-virtual-machine:~$ sudo dpkg -i cdh4-repository_1.0_all.deb
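
This does not install Hadoop itself; it only adds a Cloudera source entry under /etc/apt/sources.list.d/ so that apt can find the CDH4 packages. You can list that directory to confirm (the exact file name may vary by release):

unmesha@unmesha-hadoop-virtual-machine:~$ ls /etc/apt/sources.list.d/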


Step 4: Install Hadoop

unmesha@unmesha-hadoop-virtual-machine:~$ sudo apt-get update
unmesha@unmesha-hadoop-virtual-machine:~$ sudo apt-get install hadoop-0.20-conf-pseudo
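
The hadoop-0.20-conf-pseudo package pulls in the HDFS and MRv1 daemons along with a ready-made single-node (localhost) configuration. To see what was installed, you can query dpkg; the exact package list will vary with your CDH4 minor version:

unmesha@unmesha-hadoop-virtual-machine:~$ dpkg -l | grep hadoop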


Step 5: Format the NameNode

unmesha@unmesha-hadoop-virtual-machine:~$ sudo -u hdfs hdfs namenode -format


Step 6: Start HDFS

unmesha@unmesha-hadoop-virtual-machine:~$ for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do sudo service $x start ; done
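
To verify the HDFS daemons came up, list the running Java processes with jps (it ships with the JDK); you should see NameNode, DataNode and SecondaryNameNode among them. The NameNode web UI should also respond at http://localhost:50070.

unmesha@unmesha-hadoop-virtual-machine:~$ sudo jps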


Step 7: Create the /tmp Directory

unmesha@unmesha-hadoop-virtual-machine:~$ sudo -u hdfs hadoop fs -mkdir /tmp
unmesha@unmesha-hadoop-virtual-machine:~$ sudo -u hdfs hadoop fs -chmod -R 1777 /tmp


Step 8: Create the MapReduce system directories

unmesha@unmesha-hadoop-virtual-machine:~$ sudo -u hdfs hadoop fs -mkdir -p /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
unmesha@unmesha-hadoop-virtual-machine:~$ sudo -u hdfs hadoop fs -chmod 1777 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
unmesha@unmesha-hadoop-virtual-machine:~$ sudo -u hdfs hadoop fs -chown -R mapred /var/lib/hadoop-hdfs/cache/mapred


Step 9: Verify the HDFS File Structure

unmesha@unmesha-hadoop-virtual-machine:~$ sudo -u hdfs hadoop fs -ls -R /
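
If the previous steps succeeded, the listing should show /tmp and the /var/lib/hadoop-hdfs/cache/mapred/mapred/staging tree: /tmp and the staging directory world-writable with the sticky bit set (drwxrwxrwt), and the mapred subtree owned by the mapred user. Timestamps and sizes will of course differ on your machine.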


Step 10: Start MapReduce

unmesha@unmesha-hadoop-virtual-machine:~$ for x in `cd /etc/init.d ; ls hadoop-0.20-mapreduce-*` ; do sudo service $x start ; done
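
As with HDFS, you can confirm the MapReduce daemons with jps; a JobTracker and a TaskTracker process should now be running, and the JobTracker web UI should respond at http://localhost:50030.

unmesha@unmesha-hadoop-virtual-machine:~$ sudo jps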


Step 11: Set up user directory

unmesha@unmesha-hadoop-virtual-machine:~$ sudo -u hdfs hadoop fs -mkdir /user/<your username>
unmesha@unmesha-hadoop-virtual-machine:~$ sudo -u hdfs hadoop fs -chown <your username> /user/<your username>
unmesha@unmesha-hadoop-virtual-machine:~$ sudo -u hdfs hadoop fs -mkdir /user/unmesha/new
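
Once the directory is chowned to your user, you can read and write under it without sudo. A quick sanity check, using /etc/hosts as an arbitrary local file:

unmesha@unmesha-hadoop-virtual-machine:~$ hadoop fs -put /etc/hosts /user/unmesha/new/
unmesha@unmesha-hadoop-virtual-machine:~$ hadoop fs -ls /user/unmesha/new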


Step 12: Run the grep example (you can also try the wordcount example)

unmesha@unmesha-hadoop-virtual-machine:~$ /usr/bin/hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar grep input output 'dfs[a-z.]+'
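
The job reads from an input directory in HDFS (relative paths resolve under /user/<your username>), so stage some sample data before running it. A minimal sketch, using the installed Hadoop configuration files as input and printing the matches afterwards:

unmesha@unmesha-hadoop-virtual-machine:~$ hadoop fs -mkdir input
unmesha@unmesha-hadoop-virtual-machine:~$ hadoop fs -put /etc/hadoop/conf/*.xml input
unmesha@unmesha-hadoop-virtual-machine:~$ hadoop fs -cat output/part-00000

Note that the output directory must not already exist when the job starts; remove it with hadoop fs -rm -r output before re-running.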

Step 13: Stop the services when you are done

unmesha@unmesha-hadoop-virtual-machine:~$ for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do sudo service $x stop ; done
unmesha@unmesha-hadoop-virtual-machine:~$ for x in `cd /etc/init.d ; ls hadoop-0.20-mapreduce-*` ; do sudo service $x stop ; done


Happy Hadooping ...

