hadoop-common-user mailing list archives

From Pallavi Palleti <pallavi.pall...@corp.aol.com>
Subject Re: Query over DFSClient
Date Wed, 31 Mar 2010 11:29:54 GMT

I am looking into the Hadoop 0.20 source code for the issue below. From DFSClient, 
I can see that once the datanodes given by the namenode become unreachable, 
it sets the "lastException" variable to an error message along the lines of 
"recovery from primary datanode failed N times, aborting.." (line No:2546 in 
processDatanodeError). However, I couldn't figure out where this 
exception is actually thrown. I can see the throw statement in isClosed(), but 
I can't trace the exact sequence from the streamer exiting with lastException 
set to the isClosed() method call. It would be great if someone could shed 
some light on this. Essentially, I am trying to find out whether DFSClient 
goes back to the namenode when all of the datanodes the 
namenode previously assigned for a given data block have failed.
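For what it's worth, the control flow I am asking about seems to follow a common pattern: the background streamer thread records the failure in lastException and marks the stream closed, and the exception only surfaces later, when the client thread next calls into the stream and hits the isClosed() check. Below is a minimal, simplified sketch of that pattern; the names mirror the 0.20 source (lastException, isClosed, processDatanodeError), but this is an illustration I wrote, not the actual Hadoop code:

```java
import java.io.IOException;

// Hypothetical sketch of the lastException/isClosed() hand-off in
// DFSOutputStream. Simplified: no packet queues, no real datanode I/O.
class SketchOutputStream {
    private volatile IOException lastException;
    private volatile boolean closed;

    // Called by the background streamer thread when recovery gives up,
    // e.g. after processDatanodeError exhausts its retry attempts.
    void streamerFailed(IOException e) {
        lastException = e;   // record the cause for the client thread
        closed = true;       // mark the stream dead; streamer then exits
    }

    // Called at the top of client-facing methods (write/flush/close).
    // This is where the streamer's recorded failure finally surfaces.
    private void isClosed() throws IOException {
        if (closed && lastException != null) {
            throw lastException;
        }
    }

    void write(byte[] b) throws IOException {
        isClosed();          // re-throws the streamer's failure, if any
        // ... otherwise enqueue the packet for the streamer thread ...
    }

    public static void main(String[] args) {
        SketchOutputStream out = new SketchOutputStream();
        out.streamerFailed(new IOException("recovery failed, aborting.."));
        try {
            out.write(new byte[]{1});
        } catch (IOException e) {
            System.out.println("surfaced in write(): " + e.getMessage());
        }
    }
}
```

If that reading is right, the "throw" happens on the client thread at the next write/flush/close, not in the streamer itself, which would explain why no throw site is visible near where lastException is set.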


On 03/30/2010 05:01 PM, Pallavi Palleti wrote:
> Hi,
> Could someone kindly let me know whether DFSClient handles 
> datanode failures and attempts to write to another datanode if the primary 
> datanode (and the replica datanodes) fail? I looked into the source code 
> of DFSClient and found that it attempts to write to one of the 
> datanodes in the pipeline and fails if it cannot write to at least one 
> of them. However, I am not sure, as I haven't explored it fully. If so, is 
> there a way of querying the namenode to provide different datanodes in the 
> case of failure? I am sure the Mapper does something similar 
> (requesting different datanodes from the namenode) if it 
> fails to write to the datanodes. Kindly let me know.
> Thanks,
> Pallavi
