hadoop-common-user mailing list archives

From Jon Lederman <jon2...@gmail.com>
Subject Re: HDFS FS Commands Hanging System
Date Sun, 02 Jan 2011 16:56:53 GMT
Hi Esteban,

Thanks.  Can you tell me how to check whether my node can resolve the host name?  I don't
know precisely how to do that.
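
The closest I could find are the following (a sketch, assuming a standard Linux userland on this board; please correct me if these aren't the right checks):

# hostname
# cat /etc/hosts
# getent hosts localhost
# ping -c 1 localhost

My understanding is that getent consults the same resolver configuration (/etc/nsswitch.conf and /etc/hosts) that the JVM's lookups generally go through, so if getent can resolve the name, the daemons probably can too.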

When I run HADOOP_ROOT_LOGGER=DEBUG,console hadoop fs -ls /
I get:

# HADOOP_ROOT_LOGGER=DEBUG,console hadoop fs -ls /
11/01/02 16:52:14 DEBUG conf.Configuration: java.io.IOException: config()
	at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:211)
	at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:198)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:57)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
	at org.apache.hadoop.fs.FsShell.main(FsShell.java:1880)

11/01/02 16:52:15 DEBUG security.UserGroupInformation: Unix Login: root,root
11/01/02 16:52:17 DEBUG security.UserGroupInformation: Unix Login: root,root
11/01/02 16:52:17 DEBUG ipc.Client: The ping interval is60000ms.
11/01/02 16:52:18 DEBUG ipc.Client: Connecting to localhost/127.0.0.1:9000
11/01/02 16:52:18 DEBUG ipc.Client: IPC Client (47) connection to localhost/127.0.0.1:9000 from root sending #0
11/01/02 16:52:18 DEBUG ipc.Client: IPC Client (47) connection to localhost/127.0.0.1:9000 from root: starting, having connections 1

Then the system hangs and does not return.  
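
While it is hung, I can also check from another shell whether anything is listening on the NameNode port (9000, per my core-site.xml below).  These are hypothetical commands, assuming netstat and telnet exist on this platform:

# netstat -an | grep 9000
# telnet localhost 9000

If nothing is listening on 9000, I assume the shell would just be stuck waiting on the connection.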

My core-site.xml file is as follows:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
     <property>
         <name>fs.default.name</name>
         <value>hdfs://localhost:9000</value>
     </property>
</configuration>
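
As a test of whether the host name lookup is the problem, I believe I can also point the shell at the IP directly via the generic -fs option (a sketch; I'm assuming 0.20.2 accepts it this way):

# hadoop fs -fs hdfs://127.0.0.1:9000 -ls /

If that works while hdfs://localhost:9000 hangs, I would take that as evidence of a resolution issue.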


My hdfs-site.xml file is as follows:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
     <property>
         <name>dfs.replication</name>
         <value>1</value>
     </property>
</configuration>


My mapred-site.xml file is as follows:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
     <property>
         <name>mapred.job.tracker</name>
         <value>localhost:9001</value>
     </property>
</configuration>

My masters and slaves files both contain just: localhost
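
For completeness, here is what I assume a sane /etc/hosts would look like for this single-node setup (a sketch; the board's real hostname may need its own entry):

127.0.0.1    localhost localhost.localdomain

so that both the daemons and the shell resolve localhost the same way.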

Thanks for your help.  I really appreciate this.

-Jon
On Jan 2, 2011, at 8:47 AM, Esteban Gutierrez Moguel wrote:

> Hello Jon,
> 
> Could you please verify that your node can resolve the host name?
> 
> It would also be helpful if you could attach your configuration files and the
> output of:
> 
> HADOOP_ROOT_LOGGER=DEBUG,console hadoop fs -ls /
> 
> as Todd suggested.
> 
> Cheers,
> esteban
> On Jan 1, 2011 2:01 PM, "Jon Lederman" <jon2718@gmail.com> wrote:
>> Hi,
>> 
>> Still no luck in getting FS commands to work. I did take a look at the logs.
>> They all look pretty clean, with one exception: the DataNode appears to start
>> up fine, but the NameNode reports that the network topology has 0 racks and
>> 0 datanodes. Is this normal? Is it possible the namenode cannot talk to the
>> datanode? Any thoughts on what might be wrong?
>> 
>> Thanks in advance and happy new year.
>> 
>> -Jon
>> 2011-01-01 19:45:27,197 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
>> /************************************************************
>> STARTUP_MSG: Starting DataNode
>> STARTUP_MSG: host = localhost/127.0.0.1
>> STARTUP_MSG: args = []
>> STARTUP_MSG: version = 0.20.2
>> STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
>> ************************************************************/
>> sc-ssh-svr1 logs $ more hadoop-root-namenode-localhost.log
>> 2011-01-01 19:45:23,988 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
>> /************************************************************
>> STARTUP_MSG: Starting NameNode
>> STARTUP_MSG: host = localhost/127.0.0.1
>> STARTUP_MSG: args = []
>> STARTUP_MSG: version = 0.20.2
>> STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
>> ************************************************************/
>> 2011-01-01 19:45:27,059 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=8020
>> 2011-01-01 19:45:28,355 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: localhost.localdomain/127.0.0.1:8020
>> 2011-01-01 19:45:28,448 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
>> 2011-01-01 19:45:28,492 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
>> 2011-01-01 19:45:29,758 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=root,root
>> 2011-01-01 19:45:29,763 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
>> 2011-01-01 19:45:29,770 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
>> 2011-01-01 19:45:29,965 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
>> 2011-01-01 19:45:29,994 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
>> 2011-01-01 19:45:30,603 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 1
>> 2011-01-01 19:45:30,696 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 0
>> 2011-01-01 19:45:30,701 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 94 loaded in 0 seconds.
>> 2011-01-01 19:45:30,708 INFO org.apache.hadoop.hdfs.server.common.Storage: Edits file /tmp/hadoop-root/dfs/name/current/edits of size 4 edits # 0 loaded in 0 seconds.
>> 2011-01-01 19:45:30,767 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 94 saved in 0 seconds.
>> 2011-01-01 19:45:30,924 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 1701 msecs
>> 2011-01-01 19:45:30,945 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Total number of blocks = 0
>> 2011-01-01 19:45:30,948 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of invalid blocks = 0
>> 2011-01-01 19:45:30,958 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of under-replicated blocks = 0
>> 2011-01-01 19:45:30,963 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of over-replicated blocks = 0
>> 2011-01-01 19:45:30,966 INFO org.apache.hadoop.hdfs.StateChange: STATE* Leaving safe mode after 1 secs.
>> 2011-01-01 19:45:30,971 INFO org.apache.hadoop.hdfs.StateChange: STATE* Network topology has 0 racks and 0 datanodes
>> 2011-01-01 19:45:30,973 INFO org.apache.hadoop.hdfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks
>> 2011-01-01 19:45:33,929 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
>> 2011-01-01 19:45:35,020 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50070
>> 2011-01-01 19:45:35,036 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50070 webServer.getConnectors()[0].getLocalPort() returned 50070
>> 2011-01-01 19:45:35,038 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50070
>> 2011-01-01 19:45:35,041 INFO org.mortbay.log: jetty-6.1.14
>> sc-ssh-svr1 logs $
>> 
>> On Dec 31, 2010, at 4:28 PM, li ping wrote:
>> 
>>> I suggest you look through the logs to see if there are any errors.
>>> The second point is which node you run the command "hadoop fs -ls" on:
>>> if you run the command on node A, the configuration item "fs.default.name"
>>> on node A should point to the HDFS namenode.
>>> 
>>> On Sat, Jan 1, 2011 at 3:20 AM, Jon Lederman <jon2718@gmail.com> wrote:
>>> 
>>>> Hi Michael,
>>>> 
>>>> Thanks for your response. It doesn't seem to be an issue with safemode.
>>>> 
>>>> Even when I try the command dfsadmin -safemode get, the system hangs. I am
>>>> unable to execute any FS shell commands other than hadoop fs -help.
>>>> 
>>>> I am wondering whether this is an issue with communication between the
>>>> daemons? What should I be looking at there? Or could it be something else?
>>>> 
>>>> When I do jps, I do see all the daemons listed.
>>>> 
>>>> Any other thoughts?
>>>> 
>>>> Thanks again and happy new year.
>>>> 
>>>> -Jon
>>>> On Dec 31, 2010, at 9:09 AM, Black, Michael (IS) wrote:
>>>> 
>>>>> Try checking your dfs status
>>>>> 
>>>>> hadoop dfsadmin -safemode get
>>>>> 
>>>>> Probably says "ON"
>>>>> 
>>>>> hadoop dfsadmin -safemode leave
>>>>> 
>>>>> Somebody else can probably say how to make this happen every reboot....
>>>>> 
>>>>> Michael D. Black
>>>>> Senior Scientist
>>>>> Advanced Analytics Directorate
>>>>> Northrop Grumman Information Systems
>>>>> 
>>>>> 
>>>>> ________________________________
>>>>> 
>>>>> From: Jon Lederman [mailto:jon2718@gmail.com]
>>>>> Sent: Fri 12/31/2010 11:00 AM
>>>>> To: common-user@hadoop.apache.org
>>>>> Subject: EXTERNAL:HDFS FS Commands Hanging System
>>>>> 
>>>>> 
>>>>> 
>>>>> Hi All,
>>>>> 
>>>>> I have been working on running Hadoop on a new microprocessor architecture
>>>>> in pseudo-distributed mode. I have been successful in getting SSH
>>>>> configured. I am also able to start a namenode, secondary namenode,
>>>>> tasktracker, jobtracker and datanode, as evidenced by the response I get
>>>>> from jps.
>>>>> 
>>>>> However, when I attempt to interact with the file system in any way, such
>>>>> as with the simple command hadoop fs -ls, the system hangs. So it appears
>>>>> to me that some communication is not occurring properly. Does anyone have
>>>>> any suggestions on what I should look into in order to fix this problem?
>>>>> 
>>>>> Thanks in advance.
>>>>> 
>>>>> -Jon
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> --
>>> -----李平
>> 

