Saturday, 30 April 2016

Apache Hadoop installation on AWS, Red Hat Linux

Apache Hadoop installation

prerequisite :
1. an AWS account.

Steps :

1) update the packages on all instances.
cmd : sudo yum update

2) install wget
cmd : sudo yum install wget

3) install JDK 1.7.
ref :
1) Oracle
2) Red Hat

cmd : sudo yum install java-1.7.0-openjdk-devel

a) check the installed Java version
cmd : java -version
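
If the version check needs to happen inside a script rather than by eye, something like the sketch below may help. Note that java -version writes to stderr, so the output has to be redirected before it can be parsed. The helper name parse_java_version is hypothetical, and the quoted-version format is assumed from typical JDK 7 output.

```shell
# Hypothetical helper: read `java -version` output on stdin and print
# the quoted version string from the first line that contains one.
parse_java_version() {
    sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Usage (java -version prints to stderr, hence 2>&1):
#   java -version 2>&1 | parse_java_version
```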

b) find the real path of the java binary
cmd : readlink -f $(which java)
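
The path printed by readlink points at the java binary itself, while JAVA_HOME should point at the installation directory above it. A minimal sketch of that conversion, assuming the binary sits under bin/ (optionally under jre/bin/, as OpenJDK packages often lay it out); the helper name java_home_from_binary is hypothetical:

```shell
# Hypothetical helper: turn the resolved path of the java binary into a
# candidate JAVA_HOME by dropping the trailing /bin/java and, if
# present, a trailing /jre.
java_home_from_binary() {
    home=${1%/bin/java}
    home=${home%/jre}
    printf '%s\n' "$home"
}

# Usage:
#   java_home_from_binary "$(readlink -f "$(which java)")"
```

Check the printed directory before putting it into /etc/profile; the layout can differ between JDK packages.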

c) set the Java path in Linux.
open /etc/profile and add the export lines below:
cmd : sudo vi /etc/profile

export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-
export JRE_HOME=/usr/lib/jvm/java-1.7.0-openjdk-
export PATH=$PATH:/usr/lib/jvm/java-1.7.0-openjdk-

4) install Hadoop (2.7)

a) download the release archive
ref : hadoop release
cmd : sudo wget

b) extract the file

cmd : sudo tar -xvf hadoop-2.7.2.tar.gz

c) move it to /usr/local
cmd :  sudo mv hadoop-2.7.2 /usr/local/
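
The extract and move steps above can also be done in one go by unpacking straight into the target directory. This is only a sketch: the helper name unpack_hadoop is hypothetical, and it assumes the archive is gzip-compressed, as the .tar.gz name suggests.

```shell
# Hypothetical helper: unpack a Hadoop release archive into a target
# directory, failing early if the archive is missing.
unpack_hadoop() {
    archive=$1
    dest=$2
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}

# Usage, from a root shell since /usr/local is not user-writable:
#   unpack_hadoop hadoop-2.7.2.tar.gz /usr/local
```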

5) configure environment variables for Hadoop in RHEL
a) open the profile to edit
cmd :  sudo vi /etc/profile

export JAVA_HOME='path to jdk /java/jdk1.7.0_25'
export HADOOP_HOME='/usr/local/hadoop-2.7.2'
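
As an alternative to hand-editing /etc/profile, the export lines can be generated and reviewed first. The sketch below is an assumption-laden example: the helper name hadoop_profile_lines is hypothetical, the PATH line is not part of the original steps, and both paths are assumed to contain no spaces.

```shell
# Hypothetical helper: print the export lines for a given JAVA_HOME ($1)
# and HADOOP_HOME ($2). The PATH line (an assumption, not in the
# original steps) adds Hadoop's bin/ and sbin/ directories so that hdfs
# and the start scripts can be run from any directory.
hadoop_profile_lines() {
    printf 'export JAVA_HOME=%s\n' "$1"
    printf 'export HADOOP_HOME=%s\n' "$2"
    printf 'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin\n'
}

# Usage (replace the first argument with your real JDK directory):
#   hadoop_profile_lines /path/to/jdk /usr/local/hadoop-2.7.2
```

On RHEL the output could be saved as a file under /etc/profile.d/ instead of being pasted into /etc/profile; either way, log in again or source the file for the variables to take effect.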

b) Configure and set the Java path for Hadoop.

b.1) go to the Hadoop configuration directory and set JAVA_HOME there

sudo vi /etc/hadoop/
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-

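c) edit core-site.xml
cmd : sudo vi /etc/hadoop/core-site.xml

add the property configuration, with the value pointing at your instance:

    <value>hdfs://'your ip':9000</value>

for port 9000 to be reachable on AWS:
1) add port 9000 to the instance's security group in AWS
2) add a CIDR block or a Security Group ID as the allowed source.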
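
Only the value line of the core-site.xml property is shown above. A minimal sketch of a complete file, assuming the property name is fs.defaultFS (the usual key for the default filesystem URI in Hadoop 2.x; the original snippet does not show it). The helper name write_core_site is hypothetical.

```shell
# Hypothetical helper: write a minimal core-site.xml into conf_dir ($1)
# for the given host or IP ($2). fs.defaultFS is an assumed property
# name. This overwrites any existing core-site.xml in that directory.
write_core_site() {
    conf_dir=$1
    host=$2
    cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://$host:9000</value>
  </property>
</configuration>
EOF
}

# Usage, with your own configuration directory and IP:
#   write_core_site /usr/local/hadoop-2.7.2/etc/hadoop 'your ip'
```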

d) edit mapred-site.xml.template

sudo vi mapred-site.xml.template

add the following property


e) edit hdfs-site.xml

add the following property.
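
The property snippets for mapred-site.xml and hdfs-site.xml did not survive in this post. The sketch below writes a one-property site file; the helper name write_site_xml is hypothetical, and the property names and values in the usage lines are assumptions borrowed from the usual single-node setup, not taken from this post.

```shell
# Hypothetical helper: write a Hadoop site file ($1) containing a single
# property with the given name ($2) and value ($3).
# This overwrites the target file.
write_site_xml() {
    file=$1
    name=$2
    value=$3
    cat > "$file" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$name</name>
    <value>$value</value>
  </property>
</configuration>
EOF
}

# Usage with assumed single-node values ($conf is your config directory):
#   write_site_xml "$conf/mapred-site.xml" mapreduce.framework.name yarn
#   write_site_xml "$conf/hdfs-site.xml" dfs.replication 1
```

As far as I know, Hadoop 2.7 reads mapred-site.xml rather than mapred-site.xml.template, which is why the usage line writes the file without the .template suffix.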




6) format the namenode.

go to bin/

cmd : hdfs namenode -format
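
Formatting wipes the NameNode metadata, so it may be worth guarding the command on machines where HDFS could already be in use. A minimal sketch, assuming a formatted NameNode directory contains a current/VERSION file; the helper name namenode_is_unformatted is hypothetical.

```shell
# Hypothetical helper: succeed only if the NameNode directory ($1) does
# not already hold a formatted NameNode, detected by the presence of
# current/VERSION inside it.
namenode_is_unformatted() {
    [ ! -f "$1/current/VERSION" ]
}

# Usage, assuming the default storage location under /tmp/hadoop-$USER
# (adjust if dfs.namenode.name.dir or hadoop.tmp.dir is set):
#   namenode_is_unformatted "/tmp/hadoop-$USER/dfs/name" && hdfs namenode -format
```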

7) start the Hadoop cluster, or
start the NameNode daemon and DataNode daemon
cmd :  sbin/
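
After starting the daemons, the jps tool from the JDK lists the running Java processes, which gives a quick way to confirm both daemons came up. A small sketch that checks its output; the helper name hdfs_daemons_up is hypothetical.

```shell
# Hypothetical helper: read `jps` output ("<pid> <name>" per line) on
# stdin and succeed only if both NameNode and DataNode are listed.
# Comparing the whole second field keeps SecondaryNameNode from being
# counted as NameNode.
hdfs_daemons_up() {
    awk '$2 == "NameNode" { nn = 1 }
         $2 == "DataNode" { dn = 1 }
         END { exit !(nn && dn) }'
}

# Usage:
#   jps | hdfs_daemons_up && echo "HDFS daemons are running"
```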


online resources :
1) HUE
2) Demo 
