hadoop-common-user mailing list archives

From jaylac <Jayalakshmi.Munias...@cognizant.com>
Subject RE: Detailed steps to run Hadoop in distributed system...
Date Fri, 02 Mar 2007 10:33:29 GMT


What should I do? I don't know what to do.


Devaraj Das wrote:
> 
>> Has anyone successfully tried running Hadoop on two systems?
> Of course! We have Hadoop running on clusters of 900 nodes. 
> 
>> On the master node, I have a user named "jaya". Is it necessary to
>> create a user named "jaya" on the slave system as well, or can we
>> simply use the user name that exists on the slave machine?
> You should ideally run Hadoop as the same user on all machines in the
> cluster. The shell scripts for starting/stopping the Hadoop daemons use
> ssh to connect to the machines listed in the slaves file. Although you
> can probably work around that, I would recommend that you have the same
> user everywhere.
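A minimal sketch of the "same user everywhere" setup with passwordless ssh, assuming a "jaya" account is also created on the slave (these are standard OpenSSH commands, not something given in this thread):

         # on the master, logged in as jaya
         ssh-keygen -t rsa                # accept the defaults, empty passphrase
         ssh-copy-id jaya@10.229.62.56    # assumes a jaya account exists on the slave
         ssh jaya@10.229.62.56 hostname   # should log in without a password prompt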
> 
> From the log messages, it looks like the host 10.229.62.6 could not
> communicate with the other host in order to start the Hadoop daemons.
> Please address that issue first.
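The "No route to host" ssh failure usually points at basic network reachability or a firewall rather than at Hadoop itself. A quick way to narrow it down (plain Linux commands, nothing Hadoop-specific; the iptables check assumes a stock Red Hat setup):

         # from the master
         ping 10.229.62.56              # is the slave reachable at all?
         ssh -v 146736@10.229.62.56     # can ssh reach port 22 on the slave?
         # on the slave, check whether a firewall is blocking port 22
         /sbin/service iptables status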
> 
>> -----Original Message-----
>> From: jaylac [mailto:Jayalakshmi.Muniasamy@cognizant.com]
>> Sent: Friday, March 02, 2007 1:17 PM
>> To: hadoop-user@lucene.apache.org
>> Subject: Detailed steps to run Hadoop in distributed system...
>> 
>> 
>> Hi Hadoop-Users.....
>> 
>> Has anyone successfully tried running Hadoop on two systems?
>> 
>> I've tried running the wordcount example on one system, and it works
>> fine. But when I try to add nodes to the cluster and run the wordcount
>> example, I get errors.
>> 
>> So please let me know the detailed steps to follow.
>> 
>> Though the steps are given on the Hadoop website, I need some help
>> from you.
>> 
>> They might have thought some steps were obvious and not stated them
>> on the website.
>> 
>> I'm a new user, so I simply followed the instructions given. I might
>> have overlooked some step that is necessary to run it.
>> 
>> Another important question:
>> 
>> On the master node, I have a user named "jaya". Is it necessary to
>> create a user named "jaya" on the slave system as well, or can we
>> simply use the user name that exists on the slave machine?
>> 
>> 
>> 
>> I'm using two Red Hat Linux machines: one master (10.229.62.6) and
>> the other slave (10.229.62.56).
>> On the master node, the user name is jaya.
>> On the slave node, the user name is 146736.
>> 
>> The steps I follow are:
>> 
>> Edit the /home/jaya/.bashrc file
>>           Here I set the HADOOP_CONF_DIR environment variable
>> 
>> MASTER NODE
>> 
>> 1. Edit conf/slaves file....
>>         Contents
>>         ====================
>>          localhost
>>           146736@10.229.62.56
>>          ====================
>> 
>> 2. Edit the conf/hadoop-env.sh file
>>          Here I set the JAVA_HOME environment variable.
>>          That's it; no other changes in this file.
>>          PLEASE LET ME KNOW IF I SHOULD ADD ANYTHING HERE
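For reference, the JAVA_HOME setting in conf/hadoop-env.sh is just an export; the JDK path below is only an example, so substitute the actual install location on each machine:

         export JAVA_HOME=/usr/java/jdk1.5.0_06   # example path; point this at your real JDK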
>> 
>> 3. Edit conf/hadoop-site.xml file
>>        Contents
>>         ===========================================
>>          <?xml version="1.0"?>
>>          <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>> 
>>          <!-- Put site-specific property overrides in this file. -->
>> 
>>          <configuration>
>> 
>>          <property>
>>          <name>fs.default.name</name>
>>          <value>10.229.62.6:50010</value>
>>          </property>
>> 
>>          <property>
>>          <name>mapred.job.tracker</name>
>>          <value>10.229.62.6:50011</value>
>>          </property>
>> 
>>          <property>
>>          <name>dfs.replication</name>
>>          <value>2</value>
>>          </property>
>> 
>>          </configuration>
>>          ====================================
>> 
>>          LET ME KNOW IF I NEED TO ADD ANYTHING HERE....
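One thing that may be worth double-checking: if I remember the defaults correctly, port 50010 is the port the datanode uses for data transfers, so pointing fs.default.name at 50010 on a host that also runs a datanode can collide. A hadoop-site.xml using the more conventional tutorial ports would look roughly like this, with the rest of the file unchanged:

         <property>
         <name>fs.default.name</name>
         <value>10.229.62.6:9000</value>
         </property>

         <property>
         <name>mapred.job.tracker</name>
         <value>10.229.62.6:9001</value>
         </property>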
>> 
>> SLAVE NODE
>> 
>> 1. Edit conf/masters file....
>>         Contents
>>         ====================
>>          localhost
>>           jaya@10.229.62.56
>>          ====================
>> 
>> 2. Edit the conf/hadoop-env.sh file
>>          Here I set the JAVA_HOME environment variable.
>>          That's it; no other changes in this file.
>>          PLEASE LET ME KNOW IF I SHOULD ADD ANYTHING HERE
>> 
>> 3. Edit conf/hadoop-site.xml file
>>        Contents
>>         ===========================================
>>          <?xml version="1.0"?>
>>          <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>> 
>>          <!-- Put site-specific property overrides in this file. -->
>> 
>>          <configuration>
>> 
>>          <property>
>>          <name>fs.default.name</name>
>>          <value>10.229.62.6:50010</value>
>>          </property>
>> 
>>          <property>
>>          <name>mapred.job.tracker</name>
>>          <value>10.229.62.6:50011</value>
>>          </property>
>> 
>>          <property>
>>          <name>dfs.replication</name>
>>          <value>2</value>
>>          </property>
>> 
>>          </configuration>
>>          ====================================
>> 
>>          LET ME KNOW IF I NEED TO ADD ANYTHING HERE....
>> 
>> I've already done the steps for passwordless login.
>> 
>> That's all. Then I perform the following operations:
>> 
>> In the HADOOP_HOME directory,
>> 
>> [jaya@localhost hadoop-0.11.0]$ bin/hadoop namenode -format
>> Re-format filesystem in /tmp/hadoop-146736/dfs/name ? (Y or N) Y
>> Formatted /tmp/hadoop-146736/dfs/name
>> [jaya@localhost hadoop-0.11.0]$
>> 
>> Then
>> 
>> [jaya@localhost hadoop-0.11.0]$ bin/start-all.sh
>> starting namenode, logging to
>> /opt/hadoop-0.11.0/bin/../logs/hadoop-jaya-namenode-
>> localhost.localdomain.out
>> localhost: starting datanode, logging to
>> /opt/hadoop-0.11.0/bin/../logs/hadoop-jaya-datanode-
>> localhost.localdomain.out
>> 146736@10.229.62.56: ssh: connect to host 10.229.62.56 port 22: No route
>> to
>> host
>> localhost: starting secondarynamenode, logging to
>> /opt/hadoop-0.11.0/bin/../logs/hadoop-jaya-secondarynamenode-
>> localhost.localdomain.out
>> starting jobtracker, logging to
>> /opt/hadoop-0.11.0/bin/../logs/hadoop-jaya-jobtracker-
>> localhost.localdomain.out
>> localhost: starting tasktracker, logging to
>> /opt/hadoop-0.11.0/bin/../logs/hadoop-jaya-tasktracker-
>> localhost.localdomain.out
>> 146736@10.229.62.56: ssh: connect to host 10.229.62.56 port 22: No route
>> to
>> host
>> [jaya@localhost hadoop-0.11.0]$
>> 
>> [jaya@localhost hadoop-0.11.0]$ mkdir input
>> [jaya@localhost hadoop-0.11.0]$ cp conf/*.xml input
>> [jaya@localhost hadoop-0.11.0]$
>> 
>> [jaya@localhost hadoop-0.11.0]$ bin/hadoop dfs -put input input
>> [jaya@localhost hadoop-0.11.0]$ bin/hadoop dfs -lsr /
>> /tmp    <dir>
>> /tmp/hadoop-jaya        <dir>
>> /tmp/hadoop-jaya/mapred <dir>
>> /tmp/hadoop-jaya/mapred/system  <dir>
>> /user   <dir>
>> /user/jaya      <dir>
>> /user/jaya/input        <dir>
>> /user/jaya/input/hadoop-default.xml     <r 2>   21708
>> /user/jaya/input/hadoop-site.xml        <r 2>   1333
>> /user/jaya/input/mapred-default.xml     <r 2>   180
>> [jaya@localhost hadoop-0.11.0]$
>> 
>> 
>> 
>> [jaya@localhost hadoop-0.11.0]$ bin/hadoop dfs -ls input
>> Found 3 items
>> /user/jaya/input/hadoop-default.xml     <r 2>   21708
>> /user/jaya/input/hadoop-site.xml        <r 2>   1333
>> /user/jaya/input/mapred-default.xml     <r 2>   180
>> [jaya@localhost hadoop-0.11.0]$ bin/hadoop dfs -ls output
>> Found 0 items
>> [jaya@localhost hadoop-0.11.0]$ bin/hadoop jar hadoop-0.11.0-examples.jar
>> wordcount input output
>> java.net.SocketTimeoutException: timed out waiting for rpc response
>>         at org.apache.hadoop.ipc.Client.call(Client.java:469)
>>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:164)
>>         at $Proxy1.getProtocolVersion(Unknown Source)
>>         at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:248)
>>         at org.apache.hadoop.mapred.JobClient.init(JobClient.java:200)
>>         at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:192)
>>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:381)
>>         at org.apache.hadoop.examples.WordCount.main(WordCount.java:143)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:
>> 39)
>>         at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorIm
>> pl.java:25)
>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>         at
>> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriv
>> er.java:71)
>>         at
>> org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:143)
>>         at
>> org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:40)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:
>> 39)
>>         at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorIm
>> pl.java:25)
>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
>> [jaya@localhost hadoop-0.11.0]$
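The SocketTimeoutException here means the job client never got an RPC response from the JobTracker at 10.229.62.6:50011, which usually means the JobTracker did not actually start or is not listening on that port. A couple of quick checks on the master (jps ships with the JDK; the log file name below follows the pattern shown in the start-all.sh output above, but verify it against what is actually in the logs directory):

         jps                                          # should list NameNode, DataNode, JobTracker, TaskTracker, SecondaryNameNode
         tail -50 logs/hadoop-jaya-jobtracker-*.log   # look for bind errors or other exceptions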
>> 
>> 
>> I don't know where the problem is.
>> 
>> I've not created any directory called output. If we need to create
>> one, where should we create it?
>> Should I configure some more settings? Please explain in detail.
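As far as I know, the wordcount example creates its output directory itself and will fail if that directory already exists, so there should be no need to create "output" in advance. If a stale output directory is left over in DFS from an earlier attempt, it can be removed with something like this (assuming the -rmr shell command is available in this release):

         bin/hadoop dfs -rmr output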
>> 
>> Please do help me.
>> 
>> Thanks in advance
>> Jaya
>> --
>> View this message in context:
>> http://www.nabble.com/Detailed-steps-to-run-Hadoop-in-distributed-system...-tf3332250.html#a9265480
>> Sent from the Hadoop Users mailing list archive at Nabble.com.
> 
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Detailed-steps-to-run-Hadoop-in-distributed-system...-tf3332250.html#a9267312
Sent from the Hadoop Users mailing list archive at Nabble.com.

