Tuesday, March 18, 2014

Hadoop Installations (Tarball)

Hi Folks,

Hadoop can be installed in many ways: RPM, automated installers, tarball, Yum, etc. In this blog we will go through each type of installation one by one.

Let's start with the tarball installation.

Requirements

  • Java must be installed on the node (that is the only software requirement).
  • JAVA_HOME should be set.
  • iptables should be off.
  • SELinux should be disabled.
  • The required ports should be open (9000, 9001, 50010, 50020, 50030, 50060, 50070, 50075, 50090). A quick sketch of these checks is shown below.
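A minimal pre-flight check could look like the following; this assumes a Red Hat style node where the service and getenforce commands are available, so adjust for your distribution.

# Java and JAVA_HOME
java -version
echo $JAVA_HOME

# iptables should be stopped (RHEL/CentOS style check)
sudo service iptables status

# SELinux should report Disabled (or at least Permissive)
getenforce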
Installation

Download the tarball from the Apache official website 

wget http://archive.apache.org/dist/hadoop/core/hadoop-1.0.4/hadoop-1.0.4.tar.gz

Untar the installation

tar -xzvf hadoop-1.0.4.tar.gz

Set up the environment variables in the user's .profile:

export JAVA_HOME=/path/to/jdk/installation
export HADOOP_HOME=/home/hadoop/project/hadoop-1.0.4
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH

Also update JAVA_HOME inside $HADOOP_HOME/conf/hadoop-env.sh.
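For example, assuming the JDK is installed under /usr/java/jdk1.6.0_45 (an illustrative path, use your own location), the line in hadoop-env.sh would look like this; afterwards reload the profile and confirm that the hadoop command is found:

# In $HADOOP_HOME/conf/hadoop-env.sh (example path, replace with your JDK location)
export JAVA_HOME=/usr/java/jdk1.6.0_45

# Reload the profile and verify the hadoop command is on the PATH
source ~/.profile
hadoop version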

Configuration

Edit the following files to set the minimal configuration for the cluster. Note that site-specific settings go into the *-site.xml files under $HADOOP_HOME/conf; the *-default.xml files that ship with Hadoop should be left untouched.

$HADOOP_HOME/conf/core-site.xml

<configuration>
     <property>
         <name>fs.default.name</name>
         <value>hdfs://master:9000</value>
     </property>
</configuration>
$HADOOP_HOME/conf/hdfs-site.xml

<configuration>
     <property>
         <name>dfs.replication</name>
         <value>1</value>
     </property>
</configuration>
$HADOOP_HOME/conf/mapred-site.xml

<configuration>
     <property>
         <name>mapred.job.tracker</name>
         <value>master:9001</value>
     </property>
</configuration>

Update the slaves file at $HADOOP_HOME/conf/slaves, listing every slave node in it, one hostname per line.
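For example, with two slave nodes named slave1 and slave2 (hostnames are illustrative), the slaves file would simply contain:

slave1
slave2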

This setup has to be repeated on all the nodes of the cluster. Once every node is configured, we can format the name node and then start the services.
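One simple way to avoid repeating every step by hand is to copy the already configured Hadoop directory and profile from the master to each slave, for example with scp (hostnames and paths here are illustrative):

# Copy the configured Hadoop directory and the profile to a slave node
scp -r /home/hadoop/project/hadoop-1.0.4 hadoop@slave1:/home/hadoop/project/
scp ~/.profile hadoop@slave1:~/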

Suppose master is the main node which will act as the Hadoop name node. Below are the steps we will perform on that node.

$HADOOP_HOME/bin/hadoop namenode -format

This will format HDFS, and now we are ready to start the services on all the nodes.

For the master node

$HADOOP_HOME/bin/hadoop-daemon.sh start namenode
$HADOOP_HOME/bin/hadoop-daemon.sh start jobtracker
$HADOOP_HOME/bin/hadoop-daemon.sh start secondarynamenode

For the slave nodes

$HADOOP_HOME/bin/hadoop-daemon.sh start datanode
$HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker

Now we can check the services at the URLs below:

Namenode: http://master:50070/
Jobtracker: http://master:50030/
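For a command-line check, the jps tool that ships with the JDK lists the running Java daemons on each node:

# On the master we expect NameNode, JobTracker and SecondaryNameNode
jps

# On a slave we expect DataNode and TaskTracker
jps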


This is the simplest and easiest tarball installation of Hadoop. Please comment if you face any issue during the installation.
