Hi Folks,
Today we are going to do a yum installation of CDH4. It's a pretty easy one.
Requirements
- Oracle JDK 1.6
- CentOS 6.4
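Before you start, it is worth a quick check that the JDK Hadoop will pick up really is Oracle JDK 1.6 (the path below is only an example; adjust it to wherever your JDK is installed):
java -version          # should report something like java version "1.6.0_xx"
echo $JAVA_HOME        # e.g. /usr/java/jdk1.6.0_45 (example path, not from this tutorial)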
1. Downloading the CDH4 Repo file
sudo wget -O /etc/yum.repos.d/cloudera-cdh4.repo http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/cloudera-cdh4.repo
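If the download worked, yum should now know about the Cloudera repository; a quick way to confirm:
yum repolist | grep -i cloudera    # the cloudera-cdh4 repo should show up here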
2. Downloading and installing Cloudera CDH4
sudo yum install hadoop-0.20-conf-pseudo
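Once the install finishes, you can confirm the CDH4 bits landed on the box (the exact version string depends on which CDH4 release the repo served):
hadoop version           # should report a 2.0.0-cdh4.x build
rpm -qa | grep hadoop    # lists the installed hadoop-* packages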
3. Formatting the namenode
sudo -u hdfs hdfs namenode -format
4. Starting HDFS services on the respective nodes
- Namenode services on the master node
sudo service hadoop-hdfs-namenode start
sudo service hadoop-hdfs-secondarynamenode start
- Datanode service on the master node (because it's pseudo mode)
sudo service hadoop-hdfs-datanode start
5. Creating HDFS directories on the master
sudo -u hdfs hadoop fs -mkdir /tmp
sudo -u hdfs hadoop fs -chmod -R 1777 /tmp
sudo -u hdfs hadoop fs -mkdir /user
6. Creating MapReduce directories on the master node
sudo -u hdfs hadoop fs -mkdir -p /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
sudo -u hdfs hadoop fs -chmod 1777 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
sudo -u hdfs hadoop fs -chown -R mapred /var/lib/hadoop-hdfs/cache/mapred
sudo -u hdfs mkdir -p /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
sudo -u hdfs chmod 1777 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
sudo -u hdfs chown hdfs:hadoop /var/lib/hadoop-hdfs/cache/mapred
sudo -u hdfs chown -R mapred /var/lib/hadoop-hdfs/cache/mapred
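Before starting the MRv1 daemons, it does not hurt to double-check that the staging directory really ended up owned by mapred with the sticky bit set; something along these lines should do:
sudo -u hdfs hadoop fs -ls /var/lib/hadoop-hdfs/cache/mapred/mapred
# the staging entry should show permissions drwxrwxrwt and owner mapred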
7. Starting MapReduce services on the master and the slaves
- JobTracker service on the master node
sudo service hadoop-0.20-mapreduce-jobtracker start
- TaskTracker service on the master node
sudo service hadoop-0.20-mapreduce-tasktracker start
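If everything came up, you can list the running daemons with jps (it ships with the Oracle JDK; run as root it also shows the JVMs started by the hdfs and mapred users):
sudo jps
# expect to see something like NameNode, SecondaryNameNode, DataNode, JobTracker and TaskTracker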
8. Creating a home directory for your user on HDFS
sudo -u hdfs hadoop fs -mkdir /user/$USER
sudo -u hdfs hadoop fs -chown $USER /user/$USER
9. Update the export in .profile
export HADOOP_HOME=/usr/lib/hadoop
10. You can check the HDFS directory listing by
sudo -u hdfs hadoop fs -ls /
Try running a sample job with the command below.
sudo -u hdfs hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar pi 5 10
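If the pi job finishes, a slightly more end-to-end test is the bundled wordcount example; a minimal sketch, assuming you are happy to feed it /etc/hosts as throwaway input (the /user/hdfs paths are just examples):
sudo -u hdfs hadoop fs -mkdir -p /user/hdfs/input
sudo -u hdfs hadoop fs -put /etc/hosts /user/hdfs/input/
sudo -u hdfs hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar wordcount /user/hdfs/input /user/hdfs/output
sudo -u hdfs hadoop fs -cat /user/hdfs/output/part-*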
NOTE: Please comment if you have any problem with it.
Sir, please help me regarding the following questions. I am stuck in the middle. Thanks in advance.
1. Why are these commands not running? They show the message "permission denied". What purpose do these serve?
sudo -u hdfs mkdir -p /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
sudo -u hdfs chmod 1777 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
sudo -u hdfs chown hdfs:hadoop /var/lib/hadoop-hdfs/cache/mapred
sudo -u hdfs chown -R mapred /var/lib/hadoop-hdfs/cache/mapred
2. What is hadoop-env.sh?
3. Why does this command generate the message 'testfile.txt': no such file or directory?
sudo -u hdfs hadoop fs -copyFromLocal testfile.txt /user/hdfs/input/
4. Why is the /user/mapred directory not shown by this command?
[root@localhost conf]# sudo -u mapred hadoop fs -ls /user/
Found 2 items
drwxr-xr-x - hdfs supergroup 0 2014-04-13 22:58 /user/hdfs
drwxr-xr-x - sudhakar supergroup 0 2014-04-12 13:19 /user/sudhakar
5. These commands are not working; they show the message "no such file or directory".
bin/start-all.sh
Also others starting with bin/hadoop.
6. Is it necessary to install an IDE for programming? Can we compile the 3 Java files, add the 3 .class files into a single jar file, and execute it? Compiling with javac Mapper.java generates so many errors about importing the Hadoop library. Why?
Hi Sudhakar,
If you have installed everything as root, it should be working. Check if
sudo -u hdfs hadoop fs -ls /
shows directories owned by hdfs and supergroup. If not, then you need to give your user sudo access.
2. hadoop-env.sh is used for setting global parameters for the Hadoop environment.
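For example, the kind of settings that typically live in hadoop-env.sh (under /etc/hadoop/conf on a package install; the values below are only illustrative):
# /etc/hadoop/conf/hadoop-env.sh -- illustrative values, not from this tutorial
export JAVA_HOME=/usr/java/jdk1.6.0_45     # which JDK the daemons should use
export HADOOP_HEAPSIZE=1024                # default daemon heap size in MB
export HADOOP_LOG_DIR=/var/log/hadoop      # where the daemon logs are written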
3. You can try logging in as the hdfs user and running the copy command as that same user, like
su - hdfs
$HADOOP_HOME/bin/hadoop fs -copyFromLocal abc.txt /user/hdfs/
4. Check if you have completed step 8 correctly
sudo -u hdfs hadoop fs -mkdir /user/mapred
sudo -u hdfs hadoop fs -chown $USER /user/mapred
5. export HADOOP_HOME=/usr/lib/hadoop
1. Sir, I have tried these commands and the results are as follows.
The bin/start-all.sh command was not running. What is the correct way to run this command?
[root@localhost /]# sudo -u hdfs hadoop fs -ls /
Found 3 items
drwxrwxrwt - hdfs supergroup 0 2014-04-12 12:58 /tmp
drwxr-xr-x - hdfs supergroup 0 2014-04-17 11:28 /user
drwxr-xr-x - hdfs supergroup 0 2014-04-12 13:07 /var
[root@localhost /]# sudo -u hdfs hadoop fs -ls /user/
Found 3 items
drwxr-xr-x - hdfs supergroup 0 2014-04-17 02:02 /user/hdfs
drwxr-xr-x - sudhakar supergroup 0 2014-04-17 11:28 /user/mapred
drwxr-xr-x - sudhakar supergroup 0 2014-04-12 13:19 /user/sudhakar
[root@localhost /]# export HADOOP_HOME=/usr/lib/hadoop
[root@localhost /]# bin/start-all.sh
bash: bin/start-all.sh: No such file or directory
[root@localhost /]#
2. Sir, please also see this command. It generates errors saying the hadoop packages do not exist.
# javac Mapper.java
Is it necessary to install Eclipse? How can we make Eclipse communicate with Hadoop?
Hi Sudhakar,
You need to make /user/mapred owned by the mapred user:
drwxr-xr-x - sudhakar supergroup 0 2014-04-17 11:28 /user/mapred
After exporting HADOOP_HOME, you also need to add the following to your PATH:
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
Only then will you be able to run start-all.sh.
Thank you so much, sir.
Sir, how many jar files should be added when compiling Mapper.java etc., and what is the correct command if I have to compile Mapper.java?
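For what it is worth, a minimal sketch of one way to compile and package against the installed CDH4 jars; the class and file names below are placeholders, not from the post above. Rather than adding individual jars, the hadoop classpath command prints everything the compiler needs:
mkdir -p wordcount_classes
javac -classpath `hadoop classpath` -d wordcount_classes WordCountMapper.java WordCountReducer.java WordCountDriver.java
jar -cvf wordcount.jar -C wordcount_classes/ .
hadoop jar wordcount.jar WordCountDriver /user/hdfs/input /user/hdfs/output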