Hi Aitor,

Actually, I had already done that in my test. The issue is that I did not find any disk-full information in any log.

2014-10-20 4:00 GMT-07:00 Aitor Cedres <acedres@pivotal.io>:

Hi Sam,

You can set the property "dfs.datanode.du.reserved" to reserve some space for non-DFS use. By doing that, Hadoop daemons will keep writing to log files, and it will help you diagnose the issue.
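
For illustration, the setting might look like this in hdfs-site.xml. The value is in bytes per volume; the 10 GB figure below is only an assumed example and should be sized for your own log volume.

```xml
<!-- hdfs-site.xml: keep this many bytes per volume free for non-DFS use -->
<property>
  <name>dfs.datanode.du.reserved</name>
  <!-- 10 GB = 10 * 1024^3 bytes; example value only -->
  <value>10737418240</value>
</property>
```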

Hope it helps.

Regards,
Aitor

On 20 October 2014 11:27, sam liu <samliuhadoop@gmail.com> wrote:
Hi Dhiraj,

My cluster has only one datanode, and its log contains no warning or error messages about running out of disk space. That cost me some time when looking for the root cause.

Also, I did not find any free-disk-space checking code in DataNode.java. So it would be better if the datanode checked free disk space regularly and wrote warning/error messages to its log.
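
As a rough illustration of the kind of check I mean, here is a minimal standalone sketch. It is written in Python for brevity and is not Hadoop code; the function names, logger name, and thresholds are all hypothetical.

```python
import logging
import shutil

log = logging.getLogger("datanode.diskcheck")  # hypothetical logger name


def classify_free_space(free_bytes, warn_bytes, error_bytes):
    """Map a free-byte count to a level name; the thresholds are assumptions."""
    if free_bytes < error_bytes:
        return "ERROR"
    if free_bytes < warn_bytes:
        return "WARNING"
    return "OK"


def check_volume(path, warn_bytes, error_bytes):
    """Check free space on the volume holding `path` and log if it is low."""
    free = shutil.disk_usage(path).free
    level = classify_free_space(free, warn_bytes, error_bytes)
    if level != "OK":
        # logging.WARNING / logging.ERROR are looked up by name
        log.log(getattr(logging, level),
                "Volume %s has only %d bytes free", path, free)
    return level
```

A daemon could call check_volume on each of its data directories at a fixed interval, so that a full disk shows up in the log before writes start failing.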


2014-10-19 23:28 GMT-07:00 Dhiraj Kamble <Dhiraj.Kamble@sandisk.com>:

Formatting the NameNode will cause data loss: in effect you will lose all your data on the DataNodes (or rather, access to the data on the DataNodes). The NameNode will have no idea where your data (files) are stored. I don’t think that’s what you’re looking for.

I am wondering why there isn’t any log information on the DataNode about the disk being full. What version of Hadoop are you using, and what is your configuration (Single Node, Single Node Pseudo-Distributed, or Cluster)?

 

Regards,

Dhiraj

 

From: sam liu [mailto:samliuhadoop@gmail.com]
Sent: Monday, October 20, 2014 11:51 AM
To: user@hadoop.apache.org
Subject: Re: Can add a regular check in DataNode on free disk space?

 

Hi unmesha,

Thanks for your response, but I am not clear on what effect the above operations will have on the Hadoop cluster. Could you please explain in more detail?

 

2014-10-19 21:37 GMT-07:00 unmesha sreeveni <unmeshabiju@gmail.com>:

1. Stop all Hadoop daemons

2. Remove all files from /var/lib/hadoop-hdfs/cache/hdfs/dfs/name

3. Format namenode

4. Start all Hadoop daemons.

 

On Mon, Oct 20, 2014 at 8:26 AM, sam liu <samliuhadoop@gmail.com> wrote:

Hi Experts and Developers,

At present, if a DataNode has no free disk space, there is nowhere we can learn of this situation, not even the DataNode log. In this situation, HDFS write operations fail and return the error message below. However, from the error message the user cannot tell that the root cause is that the only datanode has run out of disk space, and there is no useful hint in the datanode log either. So I believe it would be better to add a regular check on free disk space in the DataNode, which would write a WARNING or ERROR message to the datanode log when that datanode runs out of space. What's your opinion?

Error Msg:
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hadoop/PiEstimator_TMP_3_141592654/in/part0 could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and no node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1441)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2702)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:584)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440)

Thanks!



 

--
Thanks & Regards
Unmesha Sreeveni U.B
Hadoop, Bigdata Developer
Center for Cyber Security | Amrita Vishwa Vidyapeetham

 

 

 



