Installing Hadoop 2.3.0 – Chapter 1

Steps:

Step 1 : Install the JDK and configure passwordless SSH login between the Hadoop cluster nodes.

Step 2 : Download Apache Hadoop 2.3.0 and configure the cluster nodes.

Let us look at the steps in detail:

Step 1 :  Install the JDK, set environment variables, and configure passwordless SSH login between the Hadoop cluster nodes.

1.1  Log in to mn1, mn2 and dn1 as the root user and create a user account (hduser). Repeat the steps below on all the Hadoop cluster nodes (mn1, mn2 and dn1).

#useradd hduser

Set the password for hduser:

#passwd hduser

1.2   Add host entries to /etc/hosts on all the nodes in the cluster. Repeat the step below on every node.

[root@mn1 ~]#vi /etc/hosts

192.168.1.39  mn1

192.168.1.57  mn2

192.168.1.72  dn1

Save and Exit!
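Editing /etc/hosts by hand on every node is error-prone; the same entries can be added with a small idempotent script. A sketch follows — it demonstrates against a temporary file for safety, so set HOSTS_FILE=/etc/hosts (and run as root) when applying it on a real node; the add_host helper is ours, not a standard tool:

```shell
#!/bin/sh
# Append a host entry only if the hostname is not already present.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"   # use HOSTS_FILE=/etc/hosts on a real node

add_host() {
    # $1 = IP address, $2 = hostname
    grep -qw "$2" "$HOSTS_FILE" || echo "$1  $2" >> "$HOSTS_FILE"
}

add_host 192.168.1.39 mn1
add_host 192.168.1.57 mn2
add_host 192.168.1.72 dn1
```

Because of the grep guard, running the script twice does not duplicate entries.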

1.3  Download jdk-7u51-linux-i586.rpm on mn1 and copy it to all the cluster nodes (/opt/). Repeat the installation step below on all the nodes in the cluster (mn1, mn2 and dn1).

http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html

#cd /opt/

#wget --no-check-certificate --no-cookies --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com" "http://download.oracle.com/otn-pub/java/jdk/7u51-b13/jdk-7u51-linux-i586.rpm"

Copy the downloaded JDK package from mn1 to mn2 and dn1 as the root user:

#scp jdk-7u51-linux-i586.rpm mn2:/opt

#scp jdk-7u51-linux-i586.rpm dn1:/opt

Install the JDK on mn1, mn2 and dn1:

#rpm -ivh /opt/jdk-7u51-linux-i586.rpm

Note: make sure to remove any existing version of the JDK/JRE before installing.
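The copy-and-install sequence for the remote nodes can also be generated as a short script. A minimal sketch — it only prints the commands so you can review them before running them by hand (or piping them to sh):

```shell
#!/bin/sh
# Generate the copy-and-install commands for the remote nodes (mn2, dn1).
# This sketch only prints the commands; it does not contact the nodes itself.
RPM=jdk-7u51-linux-i586.rpm

CMDS=$(for node in mn2 dn1; do
    echo "scp /opt/$RPM $node:/opt/"
    echo "ssh $node rpm -ivh /opt/$RPM"
done)
echo "$CMDS"
```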

1.4   Log in to mn1, mn2 and dn1 as hduser and generate an SSH key pair (public and private keys) on all the cluster nodes.

Switch to the hduser account from the root user on all the nodes:

mn1

[hduser@mn1~]$ ssh-keygen -t rsa

mn2

[hduser@mn2~]$ ssh-keygen -t rsa

dn1

[hduser@dn1~]$ ssh-keygen -t rsa

1.5  Copy the id_rsa.pub contents into authorized_keys on mn1, mn2 and dn1. ssh-copy-id appends the public key to the remote authorized_keys file for you; repeat the steps below on every node in the cluster.

[hduser@mn1~]$cd  .ssh

[hduser@mn1 .ssh]$ ssh-copy-id hduser@mn1

[hduser@mn1 .ssh]$ ssh-copy-id hduser@mn2

[hduser@mn1 .ssh]$ ssh-copy-id hduser@dn1

1.6  Set the file permissions for id_rsa.pub and authorized_keys on all the nodes in the cluster. Repeat the step below on mn2 and dn1 as well.

[hduser@mn1 .ssh]$chmod 600 id_rsa.pub authorized_keys

1.7  Verify passwordless SSH login between the cluster nodes (mn1, mn2 and dn1). Repeat the steps below on all the nodes in the cluster.

[hduser@mn1~]$ssh mn1

[hduser@mn1~]$ssh mn2

[hduser@mn1~]$ssh dn1
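Checking each node interactively works, but a small loop can report all three at once. A sketch — BatchMode makes ssh fail immediately instead of prompting for a password, and the check_login helper name is ours, not a standard tool:

```shell
#!/bin/sh
# Report whether key-based (passwordless) login works to each cluster node.
check_login() {
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true 2>/dev/null; then
        echo "$1: passwordless login OK"
    else
        echo "$1: passwordless login FAILED"
    fi
}

RESULT=$(for node in mn1 mn2 dn1; do check_login "$node"; done)
echo "$RESULT"
```

Any node reported as FAILED still needs its key copied (step 1.5) or its permissions fixed (step 1.6).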

Step 2 :  Download Apache Hadoop 2.3.0 and configure the cluster nodes

2.1   Download the latest Apache Hadoop build from the location below. In this guide we are using hadoop-2.3.0.tar.gz.

http://apache.mirrors.pair.com/hadoop/common/hadoop-2.3.0/hadoop-2.3.0.tar.gz

Download the Apache Hadoop package using the wget command as shown below.

[hduser@mn1~]$wget http://apache.mirrors.pair.com/hadoop/common/hadoop-2.3.0/hadoop-2.3.0.tar.gz

Extract the archive:

[hduser@mn1~]$tar -zxvf hadoop-2.3.0.tar.gz

Rename hadoop-2.3.0 to 2.3.0:

[hduser@mn1~]$mv hadoop-2.3.0  2.3.0

2.2  Modify .bash_profile to set the environment variables required for Hadoop and the JDK.

[hduser@mn1~]$vi .bash_profile

export JAVA_HOME=/usr/java/jdk1.7.0_51/

export HADOOP_PREFIX="$HOME/2.3.0"

export PATH=$PATH:$HADOOP_PREFIX/bin

export PATH=$PATH:$HADOOP_PREFIX/sbin

export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}

export HADOOP_COMMON_HOME=${HADOOP_PREFIX}

export HADOOP_HDFS_HOME=${HADOOP_PREFIX}

export YARN_HOME=${HADOOP_PREFIX}

# Native Path

export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native

export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib"

Save and Exit!

Source the .bash_profile to load the new settings:

[hduser@mn1~]$source .bash_profile
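After sourcing the profile, it is worth sanity-checking that the variables took effect. A small sketch that recreates the relevant settings and verifies the Hadoop bin directory landed on PATH:

```shell
#!/bin/sh
# Recreate the relevant settings and verify the Hadoop bin directory is on PATH.
export HADOOP_PREFIX="$HOME/2.3.0"
export PATH="$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin"

case ":$PATH:" in
    *":$HADOOP_PREFIX/bin:"*) STATUS="PATH ok" ;;
    *)                        STATUS="PATH is missing $HADOOP_PREFIX/bin" ;;
esac
echo "$STATUS"
```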

2.3  Apache Hadoop cluster layout:

·  The main configuration files are located in /home/hduser/2.3.0/etc/hadoop

·  core-site.xml, hadoop-env.sh, hdfs-site.xml, mapred-site.xml, yarn-site.xml, yarn-env.sh, slaves

·  The executable scripts are located in /home/hduser/2.3.0/sbin and /home/hduser/2.3.0/bin

·  The logs are located in /home/hduser/2.3.0/logs (the logs are created once Hadoop is started)

2.4  Apache Hadoop cluster main configuration files are as shown below.

·  core-site.xml

·  hadoop-env.sh

·  hdfs-site.xml

·  mapred-site.xml

·  yarn-site.xml

·  yarn-env.sh

·  slaves
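As an example of what goes into these files, here is a minimal core-site.xml sketch that points the default filesystem at the mn1 NameNode. The host name comes from this guide; the port 9000 is a commonly used value we are assuming, and the temporary CONF_DIR is only for safe demonstration (use /home/hduser/2.3.0/etc/hadoop on the cluster):

```shell
#!/bin/sh
# Write a minimal core-site.xml into CONF_DIR (a temp dir here for safety).
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mn1:9000</value>
  </property>
</configuration>
EOF
echo "wrote $CONF_DIR/core-site.xml"
```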

Thank You.


Copyright ©Solutions@Experts.com