hadoop-hdfs-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: dfs.name.dir replication and disk not available
Date Fri, 28 Sep 2012 09:56:48 GMT
I don't know how much of this is 1.x compatible:

- When a transaction is logged and sync'd and a single edits storage
location fails during the write, only that location is ejected from
the regular write list and skipped (its state is updated in the UI,
etc., immediately). The rest remain active and the NN lives on. The
transaction is marked successful and a matching success code is
returned to the client. Further transactions also skip the ejected
storage.
- If no edit streams remain after the removal (i.e. the last remaining
disk is removed due to a write error), the transaction fails and the
NN shuts down, to prevent data loss due to lack of persistence.
- Hence, a transaction at the NN is marked complete and returns
success iff at least one location successfully wrote the edit log
for it.
- If dfs.name.dir.restore is enabled, the NN checks whether its
ejected storages are healthy again and re-adds them. The check, IIRC,
is currently done during checkpoint or checkpoint-like operations.
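
To make the rules above concrete, here is a toy model in Python. It is
only a sketch of the behaviour described in the list, not NameNode
code: the class names (StorageDir, EditLogModel, NameNodeAbort), the
example paths, and the way a restored directory is resynced from a
healthy one are all assumptions made for illustration.

```python
class NameNodeAbort(Exception):
    """Hypothetical signal that the modelled NN shuts down."""


class StorageDir:
    """Hypothetical stand-in for one dfs.name.dir edits location."""

    def __init__(self, name):
        self.name = name
        self.healthy = True  # set to False to simulate a bad disk or NFS mount
        self.edits = []

    def write(self, txn):
        if not self.healthy:
            raise IOError("write failed on " + self.name)
        self.edits.append(txn)


class EditLogModel:
    """Toy model of the eject / fail / restore rules, not real NN code."""

    def __init__(self, dirs, restore_enabled=False):
        self.active = list(dirs)
        self.ejected = []
        self.restore_enabled = restore_enabled  # models dfs.name.dir.restore

    def log_sync(self, txn):
        # Try every active location; eject only the ones that fail.
        for d in list(self.active):
            try:
                d.write(txn)
            except IOError:
                self.active.remove(d)
                self.ejected.append(d)
        # No location left means the transaction was not persisted anywhere.
        if not self.active:
            raise NameNodeAbort("no edits storage left for " + txn)
        return True  # at least one location wrote the edit

    def checkpoint(self):
        # Re-add ejected locations that are healthy again (restore enabled only).
        if not self.restore_enabled or not self.active:
            return
        for d in list(self.ejected):
            if d.healthy:
                # Assumption: the restored directory is resynced from a good copy.
                d.edits = list(self.active[0].edits)
                self.ejected.remove(d)
                self.active.append(d)


# Example run with made-up paths.
local = StorageDir("/data/1/nn")
nfs = StorageDir("/mnt/nfs/nn")
log = EditLogModel([local, nfs], restore_enabled=True)
log.log_sync("txn-1")   # written to both locations
nfs.healthy = False
log.log_sync("txn-2")   # still succeeds; nfs is ejected
nfs.healthy = True
log.checkpoint()        # nfs is healthy again and re-added
log.log_sync("txn-3")   # written to both locations again
```

With restore_enabled left at False, the checkpoint() call would do
nothing and the ejected location would stay out of the write list,
which in the real system means waiting for an NN restart.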

I guess all of this is in 1.x too, but I haven't validated it
recently. It is certainly in the versions I've been using and
supporting at my org for quite a while now. The restore is especially
handy for customers/users with the fairly common NFS mount issues,
since they don't have to restart the NN each time it ejects the NFS
directory after a bad write. For that to work, though, a soft mount
is necessary (and recommended) rather than a hard mount, which would
hang the NameNode and defeat its whole "still available despite a
few volumes failing" feature.

Does this help, Bertrand? Happy to answer any further questions.

On Fri, Sep 28, 2012 at 2:51 PM, Bertrand Dechoux <dechouxb@gmail.com> wrote:
> Hi,
> I was wondering about the safety of multiple dfs.name.dir directories.
> On one hand, we want all copies to be synchronised, but on the other hand, if
> a hard drive fails we would like the namenode to remain operational.
> How does that work? I know there is the source but I was hoping for a higher
> level description.
> Regards
> Bertrand
> PS: I am interested in the behaviour of the latest stable version, i.e.
> 1.0.3, not in old issues that have since been solved.

Harsh J
