hadoop-common-user mailing list archives

From Brian Long <br...@dotspots.com>
Subject How to deal with HDFS failures properly
Date Fri, 27 Feb 2009 01:30:15 GMT
I'm wondering what the proper actions are, for an application holding a
reference to a FileSystem object, in light of a NameNode or DataNode
failure.
* Does the FileSystem handle all of this itself (e.g. reconnect logic)?
* Do I need to get a new FileSystem using .get(Configuration)?
* Does the FileSystem need to be closed before re-getting?
* Do the answers to these questions depend on whether it's a NameNode or
DataNode that's failed?

In short, how does an application (not a Hadoop job -- just an app using
HDFS) properly recover from a NameNode or DataNode failure? I haven't
figured out the magic juju yet, and my applications are not handling DFS
outages gracefully.
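To make the question concrete, the "close and re-get" pattern I'm asking
about can be sketched generically. This is a hypothetical illustration,
not Hadoop API -- the Resource interface and readWithRetry helper are
names I made up; the close()/re-acquire calls stand in for
FileSystem.close() and FileSystem.get(conf):

```java
import java.io.IOException;
import java.util.function.Supplier;

// Generic "close the stale handle and re-acquire on failure" sketch.
// This is the pattern I'm asking whether FileSystem requires, or
// whether it handles reconnect logic internally.
public class ReconnectSketch {
    interface Resource {
        String read() throws IOException;
        void close();
    }

    // Try the operation; on an IOException, close the (possibly stale)
    // handle, re-acquire a fresh one from the supplier, and retry,
    // up to maxRetries additional attempts.
    static String readWithRetry(Supplier<Resource> acquire, int maxRetries)
            throws IOException {
        Resource r = acquire.get();
        IOException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return r.read();
            } catch (IOException e) {
                last = e;
                r.close();          // analogous to FileSystem.close()
                r = acquire.get();  // analogous to FileSystem.get(conf)
            }
        }
        throw last;
    }
}
```

Whether the close() step is necessary before re-getting -- and whether a
re-get even returns a fresh connection rather than a cached one -- is
exactly what I'm unsure about.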

