hadoop-common-user mailing list archives

From bourne1900 <bourne1...@yahoo.cn>
Subject Re: Re: could not complete file...
Date Tue, 18 Oct 2011 10:15:41 GMT
Thank you for your reply.

The datanode log shows "PIPE ERROR" and nothing else.
The client just keeps printing "Could not complete file".

From "namenodeIP:50070/dfshealth.jsp" I found that the datanode has hung, and it is the only
datanode in my cluster :)

BTW, I think the number of retries is unlimited. My Hadoop version is 0.20.2, and the code in DFSClient.java is
--------------------------------
while (!fileComplete) {
  fileComplete = namenode.complete(src, clientName);
  if (!fileComplete) {
    try {
      Thread.sleep(400);
      if (System.currentTimeMillis() - localstart > 5000) {
        LOG.info("Could not complete file " + src + " retrying...");
      }
    } catch (InterruptedException ie) {
    }
  }
}
--------------------------------
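
For comparison, here is a minimal sketch of what a bounded version of such a loop could look like. It is not taken from the Hadoop source: the class BoundedCompleteRetry, the method completeWithRetry, and the parameters maxAttempts and sleepMillis are hypothetical names, and the BooleanSupplier only stands in for the namenode.complete(src, clientName) call.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch, not Hadoop code: a retry loop that gives up
// after a fixed number of attempts instead of spinning forever.
final class BoundedCompleteRetry {

    /**
     * Calls attempt until it returns true or maxAttempts calls have been made.
     * Returns true on success, false if every attempt failed or the thread
     * was interrupted while waiting.
     */
    static boolean completeWithRetry(BooleanSupplier attempt, int maxAttempts, long sleepMillis) {
        for (int i = 1; i <= maxAttempts; i++) {
            if (attempt.getAsBoolean()) {
                return true;
            }
            if (i < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException ie) {
                    // Restore the interrupt flag rather than swallowing it.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Stand-in for namenode.complete(...): succeeds on the third call.
        int[] calls = {0};
        boolean done = completeWithRetry(() -> ++calls[0] >= 3, 5, 10);
        System.out.println("completed=" + done + " after " + calls[0] + " calls");

        // A call that never succeeds returns false after 5 attempts.
        boolean stuck = completeWithRetry(() -> false, 5, 10);
        System.out.println("completed=" + stuck);
    }
}
```

With a bound like this the caller gets a false return value and can report the failure, instead of logging "Could not complete file" without end when the only datanode is down.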

bourne1900 

Sender: Uma Maheswara Rao G 72686
Date: Tuesday, October 18, 2011, 6:00 PM
To: common-user
CC: common-user
Subject: Re: could not complete file...
----- Original Message -----
From: bourne1900 <bourne1900@yahoo.cn>
Date: Tuesday, October 18, 2011 3:21 pm
Subject: could not complete file...
To: common-user <common-user@hadoop.apache.org>

> Hi,
> 
> There are 20 threads that put files into HDFS continuously; every 
> file is 2 KB.
> When 1 million files had been written, the client began throwing "could not 
> complete file" exceptions continuously.
The "Could not complete file" message is actually an info log, not an exception. The client logs it
when closing the file. It retries for some time (100 times, as far as I remember) to ensure the write succeeds.
Did you observe any write failures here?

> At that time, the datanode had hung.
> 
> I think maybe the heartbeat was lost, so the namenode does not know the 
> state of the datanode. But I do not know why the heartbeat was lost. Is 
> there any info in the logs when a datanode cannot send 
> heartbeats?
Can you check the NN UI to verify the number of live nodes? From that we can tell whether
the DN has stopped sending heartbeats.
> 
> Thanks and regards!
> bourne

Regards,
Uma