hadoop-hdfs-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Re: what happen in my hadoop cluster?
Date Wed, 27 Jul 2011 07:45:43 GMT
Ok, are your DNs healthy? That is, are they reporting heartbeats on that
web UI page, or was their last contact quite a few seconds or minutes
ago? Did you restart them or their machines recently? What is their
configured dfs.data.dir?

The trouble here is that your NN isn't receiving back enough blocks from
the DN reports so far. A common reason is that the DNs are not all
properly up _yet_. I'd check if the DNs are alright, and look into their
logs and data directories to ensure the blocks are present, then dig
deeper from there.

Note that sometimes DNs may take over a minute to start, so you can
look at the logs and see if you need to wait before you investigate
further.
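As a rough sketch (not from the original thread; these are the standard Hadoop-era dfsadmin/fsck commands, run on the NameNode host as the HDFS user), here is how one might check the safe mode status and the DataNode reports described above:

```shell
# Ask the NameNode whether it is still in safe mode
hadoop dfsadmin -safemode get

# Summarize cluster capacity and list live/dead DataNodes,
# to confirm all DNs have actually reported in
hadoop dfsadmin -report

# Block until the NameNode leaves safe mode on its own
# (normal once enough block reports arrive)
hadoop dfsadmin -safemode wait

# Only if blocks are known to be permanently lost: force-exit
# safe mode, then run fsck to find files with missing blocks
hadoop dfsadmin -safemode leave
hadoop fsck / -files -blocks -locations
```

Forcing safe mode off with `-safemode leave` is a last resort; if the reported-block ratio is low because DNs simply haven't started yet, waiting (or fixing the DNs) is the right move.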

2011/7/27 周俊清 <2houjq@163.com>:
> Yes, I can see all the data nodes from the web
> page: http://dn224.pengyun.org:50070/dfsnodelist.jsp?
> --
> ----------------------------
> 周俊清
> 2houjq@163.com
>
> On 2011-07-27 15:30:37, "Harsh J" <harsh@cloudera.com> wrote:
>>Are all your DataNodes up?
>>
>>2011/7/27 周俊清 <2houjq@163.com>:
>>> hello everyone,
>>>     I got an exception from my jobtracker's log file as follow:
>>> 2011-07-27 01:58:04,197 INFO org.apache.hadoop.mapred.JobTracker: Cleaning
>>> up the system directory
>>> 2011-07-27 01:58:04,230 INFO org.apache.hadoop.mapred.JobTracker: problem
>>> cleaning system directory:
>>> hdfs://dn224.pengyun.org:56900/home/hadoop/hadoop-tmp203/mapred/system
>>> org.apache.hadoop.ipc.RemoteException:
>>> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete
>>> /home/hadoop/hadoop-tmp203/mapred/system. Name node is in safe mode.
>>> The ratio of reported blocks 0.2915 has not reached the threshold 0.9990.
>>> Safe mode will be turned off automatically.
>>>     at
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:1851)
>>>     at
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:1831)
>>>     at
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.delete(NameNode.java:691)
>>>     ……
>>> and
>>>    the log message of namenode:
>>> 2011-07-27 00:00:00,219 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 1 on 56900, call delete(/home/hadoop/hadoop-tmp203/mapred/system,
>>> true) from 192.168.1.224:5131
>>> 2: error: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot
>>> delete /home/hadoop/hadoop-tmp203/mapred/system. Name node is in safe mode.
>>> The ratio of reported blocks 0.2915 has not reached the threshold 0.9990.
>>> Safe mode will be turned off automatically.
>>> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete
>>> /home/hadoop/hadoop-tmp203/mapred/system. Name node is in safe mode.
>>> The ratio of reported blocks 0.2915 has not reached the threshold 0.9990.
>>> Safe mode will be turned off automatically.
>>>     at
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:1851)
>>>     at
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:1831)
>>>     at
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.delete(NameNode.java:691)
>>>     at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>>>     at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
>>>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1383)
>>>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1379)
>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>     at javax.security.auth.Subject.doAs(Subject.java:396)
>>>     at
>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>>>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1377)
>>>
>>>  I think this means the namenode is stuck in safe mode. What can I do
>>> about these exceptions? Can anyone tell me why? I can't find the path
>>> "/home/hadoop/hadoop-tmp203/mapred/system" on my system. The exceptions
>>> shown above keep repeating in the log file, even when I restart my
>>> hadoop cluster.
>>>    Thanks for your concern.
>>>
>>>
>>> ----------------------------
>>> Junqing Zhou
>>> 2houjq@163.com
>>>
>>>
>>>
>>>
>>
>>
>>
>>--
>>Harsh J
>
>
>



-- 
Harsh J
