
Ex No: 04 Procedure to set up a single-node Hadoop cluster.

Date :

Aim:
To write the procedure to set up a single-node Hadoop cluster.
Procedure:
Step 1: Installing Java
Go to the terminal and log in as the root user:
ssk@ubuntu:~$ sudo bash
root@ubuntu:~# apt-get install openjdk-8-jdk
Run these commands to make sure Java has been installed:
root@ubuntu:~# java -version
root@ubuntu:~# javac -version
Step 2: Setting JAVA_HOME variable
Run this command as root to get the Java path:
root@ubuntu:~# update-alternatives --config java
Edit /etc/environment:
root@ubuntu:~# gedit /etc/environment
Add this line to the file:
JAVA_HOME=/usr/lib/jvm/java-8-openjdk-i386
Run this command:
root@ubuntu:~# source /etc/environment
Run this command to make sure the variable was added successfully:
root@ubuntu:~# echo $JAVA_HOME
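Note that the hard-coded path above assumes a 32-bit (i386) OpenJDK 8; on a 64-bit system the directory is typically java-8-openjdk-amd64. A small sketch of deriving JAVA_HOME from the resolved java binary path (as printed by update-alternatives) by stripping the trailing /jre/bin/java, using an illustrative sample path:

```shell
# Sample resolved path of the java binary, as reported by
# update-alternatives --config java (illustrative value).
JAVA_BIN=/usr/lib/jvm/java-8-openjdk-i386/jre/bin/java
# Strip the trailing /jre/bin/java to obtain the JDK root directory.
JAVA_HOME=${JAVA_BIN%/jre/bin/java}
echo "$JAVA_HOME"   # prints /usr/lib/jvm/java-8-openjdk-i386
```

On a real system, the first line would instead be JAVA_BIN=$(readlink -f "$(which java)").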
Step 3: Installing SSH
Run this command:
root@ubuntu:~# apt-get install ssh
Now, generate a public/private RSA key pair with an empty passphrase:
root@ubuntu:~# ssh-keygen -t rsa -P ""
Make the generated public key authorized by running:
root@ubuntu:~# cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
Check if SSH works by running this command:
root@ubuntu:~# ssh localhost
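If ssh localhost still prompts for a password, the key setup above can be redone non-interactively; a sketch, assuming the default key location $HOME/.ssh (sshd ignores authorized_keys if it is group- or world-writable, hence the chmod calls):

```shell
# Create ~/.ssh with the permissions sshd requires.
KEYDIR="$HOME/.ssh"
mkdir -p "$KEYDIR" && chmod 700 "$KEYDIR"
# Generate an RSA key pair with an empty passphrase (-N "") if none
# exists, so the Hadoop start scripts can ssh without prompting.
[ -f "$KEYDIR/id_rsa" ] || ssh-keygen -t rsa -N "" -f "$KEYDIR/id_rsa"
# Authorize the public key and restrict the file's permissions.
cat "$KEYDIR/id_rsa.pub" >> "$KEYDIR/authorized_keys"
chmod 600 "$KEYDIR/authorized_keys"
```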
Step 4: Download and extract the Hadoop package
Now, as the logged-in local user, unpack the Hadoop package:
ssk@ubuntu:~$ tar -xvzf /home/ssk/Downloads/hadoop-2.7.2.tar.gz
Step 5: Installing and Configuring Hadoop
Run this command to edit .bashrc file:
ssk@ubuntu:~$ sudo gedit ~/.bashrc
Append following lines to the end of ~/.bashrc:
#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-i386
export HADOOP_INSTALL=/home/ssk/hadoop-2.7.2
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP VARIABLES END
Now, run this command:
ssk@ubuntu:~$ source ~/.bashrc
Now, get into hadoop path:
ssk@ubuntu:~$ cd hadoop-2.7.2
ssk@ubuntu:~/hadoop-2.7.2$ cd etc
ssk@ubuntu:~/hadoop-2.7.2/etc$ cd hadoop
ssk@ubuntu:~/hadoop-2.7.2/etc/hadoop$
Edit hadoop-env.sh file:
ssk@ubuntu:~/hadoop-2.7.2/etc/hadoop$ gedit hadoop-env.sh
Set the JAVA_HOME variable in hadoop-env.sh (replace the existing JAVA_HOME line):
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-i386
ssk@ubuntu:~/hadoop-2.7.2/etc/hadoop$ cd
Make a directory hadoop_store in the same directory where hadoop-2.7.2 exists.
ssk@ubuntu:~$ mkdir hadoop_store
Get inside hadoop_store
ssk@ubuntu:~$ cd hadoop_store
Make a new directory hdfs inside hadoop_store
ssk@ubuntu:~/hadoop_store$ mkdir hdfs
ssk@ubuntu:~/hadoop_store$ cd hdfs
Make two directories, namenode and datanode, inside hdfs:
ssk@ubuntu:~/hadoop_store/hdfs$ mkdir namenode
ssk@ubuntu:~/hadoop_store/hdfs$ mkdir datanode
ssk@ubuntu:~/hadoop_store/hdfs$ cd
Make sure the directories exist now.
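The directory creation steps above can also be done in a single command; a sketch using mkdir -p, assuming the same $HOME/hadoop_store location:

```shell
# -p creates any missing parent directories and is a no-op for
# directories that already exist.
mkdir -p "$HOME/hadoop_store/hdfs/namenode" "$HOME/hadoop_store/hdfs/datanode"
# List the hdfs directory to confirm both were created.
ls "$HOME/hadoop_store/hdfs"
```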
Now get back to hadoop-2.7.2/etc/hadoop
ssk@ubuntu:~$ cd hadoop-2.7.2/etc/hadoop
Edit the hdfs-site.xml file:
ssk@ubuntu:~/hadoop-2.7.2/etc/hadoop$ gedit hdfs-site.xml
Between <configuration> and </configuration>, append these lines:
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/ssk/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/ssk/hadoop_store/hdfs/datanode</value>
</property>
ssk@ubuntu:~/hadoop-2.7.2/etc/hadoop$ cd
Inside hadoop-2.7.2, make a new directory tmp
ssk@ubuntu:~$ mkdir hadoop-2.7.2/tmp
ssk@ubuntu:~$ cd hadoop-2.7.2/etc/hadoop
ssk@ubuntu:~/hadoop-2.7.2/etc/hadoop$
Edit the core-site.xml file:
ssk@ubuntu:~/hadoop-2.7.2/etc/hadoop$ gedit core-site.xml
Between <configuration> and </configuration>, append these lines:
<property>
<name>hadoop.tmp.dir</name>
<value>/home/ssk/hadoop-2.7.2/tmp</value>
<description>A base for other temporary directories.</description>
</property>

<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation.
The uri's scheme determines the config property (fs.SCHEME.impl)
naming the FileSystem implementation class. The uri's authority
is used to determine the host, port, etc. for a filesystem.</description>
</property>
Now, run this command:
ssk@ubuntu:~/hadoop-2.7.2/etc/hadoop$ cp mapred-site.xml.template mapred-site.xml
Edit the mapred-site.xml file:
ssk@ubuntu:~/hadoop-2.7.2/etc/hadoop$ gedit mapred-site.xml
Between <configuration> and </configuration>, append these lines:
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs at.
If "local", then jobs are run in-process as a single map and reduce task.
</description>
</property>
Now, format the Hadoop file system (in Hadoop 2.x the hdfs command is preferred; hadoop namenode -format still works but is deprecated):
ssk@ubuntu:~/hadoop-2.7.2/etc/hadoop$ cd
ssk@ubuntu:~$ hdfs namenode -format
Get inside hadoop-2.7.2/sbin directory:
ssk@ubuntu:~$ cd hadoop-2.7.2/sbin
Finally, start Hadoop:
ssk@ubuntu:~/hadoop-2.7.2/sbin$ start-all.sh
To make sure everything is running well, open a browser and go to:
http://localhost:8088 (the ResourceManager web UI)
For datanodes or browsing the file system, go to:
http://localhost:50070 (the NameNode web UI)
Result:
Thus the procedure to set up a single-node Hadoop cluster was executed successfully.
