INSTALLING HADOOP, HIVE, DERBY ON CENTOS
Please subscribe to my site www.jamesjara.com to get more tutorials.
INSTALLING HADOOP ON CentOS 6
INSTALLING HIVE ON CentOS 6
INSTALLING DERBY ON CentOS 6
hadoop-0.20.203.0rc1
This guide covers the installation of the Hadoop ecosystem (Hadoop, Hive and Derby).
It is quite long, so please follow it step by step.
====INSTALLATION=====
1. Installing Java
yum install sun-java6-jdk
2. Adding a dedicated user for Hadoop
This adds the user hdoopuser and the group hdoopgroup to your local machine.
/usr/sbin/useradd hdoopuser
groupadd hdoopgroup
usermod -a -G hdoopgroup hdoopuser
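If you run the three commands above a second time, useradd and groupadd complain that the names already exist. A minimal sketch of a repeatable version follows; ensure_hadoop_user, user_exists and group_exists are hypothetical helper names made up for this example, not part of Hadoop or CentOS.

```shell
#!/bin/sh
# Sketch: create the Hadoop user and group only when they are missing.
# The helper names below are illustrative, not standard tools.

# Succeeds if the given user account exists.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Succeeds if the given group exists.
group_exists() {
    getent group "$1" >/dev/null 2>&1
}

# Create the group and the user when absent, then add the user to the group.
# Arguments: user name, group name. Must be run as root.
ensure_hadoop_user() {
    group_exists "$2" || groupadd "$2" || return 1
    user_exists "$1" || /usr/sbin/useradd "$1" || return 1
    usermod -a -G "$2" "$1"
}

# Usage (as root):
#   ensure_hadoop_user hdoopuser hdoopgroup
```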
3. Configuring SSH
su - hdoopuser #log in as hdoopuser
ssh-keygen -t rsa -P "" #generate a key with an empty passphrase
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys #authorize the new key for logins to this machine
chmod 0600 $HOME/.ssh/authorized_keys #restrict the file to its owner, otherwise sshd may ignore it
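Key logins fail silently when the permissions are too open, so it can help to check them before starting Hadoop. The sketch below assumes GNU stat (as shipped with CentOS) and uses the conventional values 700 for the .ssh directory and 600 for authorized_keys; file_mode and ssh_perms_ok are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: verify the permissions that key-based SSH logins usually need.

# Print the octal permission bits of a file or directory (GNU stat syntax).
file_mode() {
    stat -c '%a' "$1"
}

# Succeeds only if <dir>/authorized_keys exists, the directory has mode 700
# and the key file has mode 600.
ssh_perms_ok() {
    [ -f "$1/authorized_keys" ] || return 1
    [ "$(file_mode "$1")" = "700" ] || return 1
    [ "$(file_mode "$1/authorized_keys")" = "600" ]
}

# Usage (as hdoopuser):
#   ssh_perms_ok "$HOME/.ssh" && echo "permissions look fine"
```

After that, ssh localhost should log you in without asking for a password.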
4. Disabling IPv6
sed -i 's/^\(NETWORKING_IPV6\s*=\s*\).*$/\1no/' /etc/sysconfig/network #set NETWORKING_IPV6=no; do not touch NETWORKING itself, setting it to no turns off all networking
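Whichever way IPv6 is switched off, the result can be read back from the kernel. This is a small sketch, assuming the standard Linux file /proc/sys/net/ipv6/conf/all/disable_ipv6; ipv6_state is a hypothetical helper name, and it accepts another path as an argument only so it can be tried on a test file.

```shell
#!/bin/sh
# Sketch: report whether the kernel has IPv6 enabled.

# Print "disabled", "enabled" or "unavailable" based on a disable_ipv6 file.
# Argument (optional): path to the file, defaults to the system-wide switch.
ipv6_state() {
    state_file=${1:-/proc/sys/net/ipv6/conf/all/disable_ipv6}
    if [ ! -r "$state_file" ]; then
        echo "unavailable"   # file missing, e.g. the IPv6 module is not loaded
    elif [ "$(cat "$state_file")" = "1" ]; then
        echo "disabled"
    else
        echo "enabled"
    fi
}

# Usage:
#   ipv6_state
```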
5. Installation/configuration/startup of Hadoop
mkdir /hadoop
chown -R hdoopuser /hadoop
cd /hadoop/
wget http://mirrors.abdicar.com/Apache-HTTP-Server//hadoop/common/stable/hadoop-0.20.203.0rc1.tar.gz
tar -xvzf hadoop-0.20.203.0rc1.tar.gz
ln -s /hadoop/hadoop-0.20.203.0rc1/ /hadoop/hadoop
cd /hadoop/hadoop
#basic config
1)
vim conf/core-site.xml
#Add the following inside the <configuration> tag
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000/</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
2)
vim conf/hdfs-site.xml
#Add the following inside the <configuration> tag
<property>
<name>dfs.name.dir</name>
<value>/hadoop/hdfs/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/hadoop/hdfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
3)
vim conf/mapred-site.xml
#Add the following inside the <configuration> tag
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
4)
vim conf/hadoop-env.sh
export JAVA_HOME=/opt/jre/
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
5) Format the namenode
su - hdoopuser
cd /hadoop/hadoop
bin/hadoop namenode -format
6) Start Hadoop
bin/start-all.sh
Notes: HTTP consoles of Hadoop
http://localhost:50030/ for the JobTracker
http://localhost:50070/ for the NameNode
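Two things commonly go wrong in the steps above: formatting a namenode that already holds data (this wipes the HDFS metadata), and a daemon that quietly fails to start. The sketch below guards against both. It assumes a formatted dfs.name.dir contains current/VERSION and that jps from the JDK is on the PATH; namenode_formatted and missing_daemons are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: sanity checks around "namenode -format" and "start-all.sh".

# Succeeds if the given dfs.name.dir already holds a formatted namenode.
namenode_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Read `jps` output ("PID ClassName" per line) on stdin and print every
# expected Hadoop daemon that is not listed. Prints nothing if all five run.
missing_daemons() {
    jps_out=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        printf '%s\n' "$jps_out" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' ||
            echo "$daemon"
    done
}

# Usage (as hdoopuser, from /hadoop/hadoop):
#   namenode_formatted /hadoop/hdfs/name || bin/hadoop namenode -format
#   bin/start-all.sh
#   jps | missing_daemons
```

If missing_daemons prints a name, the matching log file under /hadoop/hadoop/logs is the first place to look.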
5. Installation/configuration/startup of Hive/Derby
cd /hadoop
wget http://mirrors.ucr.ac.cr/apache//hive/stable/hive-0.8.1-bin.tar.gz
tar -xvzf hive-0.8.1-bin.tar.gz
ln -s /hadoop/hive-0.8.1-bin/ /hadoop/hive
export HADOOP_HOME=/hadoop/hadoop/
cd /hadoop/hive
mv conf/hive-default.xml.template conf/hive-site.xml
#test hive
bin/hive
> show tables;
#installing the Derby metastore
cd /hadoop
wget http://archive.apache.org/dist/db/derby/db-derby-10.4.2.0/db-derby-10.4.2.0-bin.tar.gz
tar -xzf db-derby-10.4.2.0-bin.tar.gz
ln -s db-derby-10.4.2.0-bin derby
mkdir derby/data
export DERBY_INSTALL=/hadoop/derby/
export DERBY_HOME=/hadoop/derby/
export HADOOP=/hadoop/hadoop/bin/hadoop
vim /hadoop/hadoop/bin/start-dfs.sh
#add the following 2 lines to start-dfs.sh
cd /hadoop/derby/data
nohup /hadoop/derby/bin/startNetworkServer -h 0.0.0.0 &
vim /hadoop/hadoop/bin/start-all.sh
#add the following 2 lines to start-all.sh
cd /hadoop/derby/data
nohup /hadoop/derby/bin/startNetworkServer -h 0.0.0.0 &
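Editing the start scripts by hand makes it easy to paste the Derby lines twice. As a sketch, the same change can be made with a helper that appends a line only if it is not already in the file. append_once is a hypothetical name, and appending at the end of the script is an assumption; the steps above do not say where in the file the lines should go.

```shell
#!/bin/sh
# Sketch: add the Derby startup lines to a script without duplicating them.

# Append a line to a file unless an identical line is already present.
# Arguments: file, line.
append_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage (as hdoopuser):
#   append_once /hadoop/hadoop/bin/start-all.sh 'cd /hadoop/derby/data'
#   append_once /hadoop/hadoop/bin/start-all.sh 'nohup /hadoop/derby/bin/startNetworkServer -h 0.0.0.0 &'
```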
#HIVE CONF
vim /hadoop/hive/conf/hive-site.xml #point the Hive metastore at the Derby network server
#search for "javax.jdo.option.ConnectionURL" and edit it as follows
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby://localhost:1527/metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
#HTTP CONSOLE OF HIVE
/hadoop/hive/bin/hive --service hwi &
URL: http://localhost:9999/hwi/
#create new file
vim /hadoop/hive/conf/jpox.properties
#add the following
javax.jdo.PersistenceManagerFactoryClass=org.jpox.PersistenceManagerFactoryImpl
org.jpox.autoCreateSchema=false
org.jpox.validateTables=false
org.jpox.validateColumns=false
org.jpox.validateConstraints=false
org.jpox.storeManagerType=rdbms
org.jpox.autoCreateSchema=true
org.jpox.autoStartMechanismMode=checked
org.jpox.transactionIsolation=read_committed
javax.jdo.option.DetachAllOnCommit=true
javax.jdo.option.NontransactionalRead=true
javax.jdo.option.ConnectionDriverName=org.apache.derby.jdbc.ClientDriver
javax.jdo.option.ConnectionURL=jdbc:derby://localhost:1527/metastore_db;create=true
javax.jdo.option.ConnectionUserName=APP
javax.jdo.option.ConnectionPassword=mine
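A stray paste inside a properties file is easy to miss. Here is a small sketch that flags suspicious lines, assuming every key in jpox.properties consists only of letters, digits, dots and underscores, followed by "=". check_properties is a hypothetical helper name, not a Hive tool.

```shell
#!/bin/sh
# Sketch: flag lines in a .properties file that are not "key=value".

# Print every line that is neither blank, a "#" comment, nor key=value.
# Returns 0 if the file is clean, 1 if malformed lines were printed,
# 2 if the file cannot be read.
check_properties() {
    [ -r "$1" ] || return 2
    if grep -vE '^([[:space:]]*(#.*)?|[A-Za-z0-9_.]+=.*)$' "$1"; then
        return 1
    fi
    return 0
}

# Usage:
#   check_properties /hadoop/hive/conf/jpox.properties
```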
#now copy the Derby client jars to the Hive lib directory
cp /hadoop/derby/lib/derbyclient.jar /hadoop/hive/lib
cp /hadoop/derby/lib/derbytools.jar /hadoop/hive/lib
#HTTP CONSOLE OF HIVE
http://localhost:9999/hwi/ for Hive
6. START CLUSTER
/hadoop/hadoop/bin/start-all.sh
/hadoop/hive/bin/hive --service hwi & #hwi=webpanel
7. FOR NEXT TIME AND EVERY TIME AFTER: add the variables to the system profile
vi /etc/profile
export JAVA_HOME=/opt/jre/
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
export HADOOP_HOME=/hadoop/hadoop/
export DERBY_INSTALL=/hadoop/derby/
export DERBY_HOME=/hadoop/derby/
export HADOOP=/hadoop/hadoop/bin/hadoop
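An alternative to editing /etc/profile directly is a separate file under /etc/profile.d, which the stock /etc/profile on CentOS sources at login. A minimal sketch follows; the file name hadoop.sh and the helper name write_hadoop_env are assumptions chosen for this example.

```shell
#!/bin/sh
# Sketch: keep the tutorial's variables in their own profile script.

# Write the environment variables from this step to the given file.
write_hadoop_env() {
    cat > "$1" <<'EOF'
export JAVA_HOME=/opt/jre/
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
export HADOOP_HOME=/hadoop/hadoop/
export DERBY_INSTALL=/hadoop/derby/
export DERBY_HOME=/hadoop/derby/
export HADOOP=/hadoop/hadoop/bin/hadoop
EOF
}

# Usage (as root):
#   write_hadoop_env /etc/profile.d/hadoop.sh
```

Removing that one file later undoes the change without touching /etc/profile.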
======RUNNING======
PANELS:
http://localhost:50030/ for the JobTracker
http://localhost:50060/ for the TaskTracker
http://localhost:50070/ for the NameNode
http://localhost:9999/hwi/ for Hive
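To check all four panels in one go instead of opening each in a browser, something like the sketch below can be used. It assumes curl is installed and only tests that each URL answers at all, not that the page content is healthy; panel_urls and check_panels are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: probe the web panels listed above.

# Print "name url" pairs for the four panels.
panel_urls() {
    cat <<'EOF'
jobtracker http://localhost:50030/
tasktracker http://localhost:50060/
namenode http://localhost:50070/
hive http://localhost:9999/hwi/
EOF
}

# Request each panel with curl and print UP or DOWN per panel.
check_panels() {
    panel_urls | while read -r name url; do
        if curl -s -o /dev/null --max-time 5 "$url"; then
            echo "UP   $name $url"
        else
            echo "DOWN $name $url"
        fi
    done
}

# Usage:
#   check_panels
```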