hadoop-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: namenode not starting
Date Mon, 27 Aug 2012 07:30:26 GMT
Abhay,

On Mon, Aug 27, 2012 at 11:19 AM, Abhay Ratnaparkhi
<abhay.ratnaparkhi@gmail.com> wrote:
> Thank you Harsh,
>
> I have set "dfs.name.dir" explicitly. Still don't know why the data loss has
> happened.
>
> <property>
>   <name>dfs.name.dir</name>
>   <value>/wsadfs/${host.name}/name</value>
>   <description>Determines where on the local filesystem the DFS name node
>       should store the name table.  If this is a comma-delimited list
>       of directories then the name table is replicated in all of the
>       directories, for redundancy. </description>
> </property>

Sorry, I missed that you had mentioned NFS above. Is the data not
present at all in that directory?

> The secondary namenode was on the same machine as the namenode. Does this
> matter, since the path of "dfs.name.dir" was the same?
> I have now configured another machine as the secondary namenode.
> I have now formatted the filesystem, since I did not see any way of recovering.
>
> I have some questions.
>
> 1. Apart from setting secondary namenode what are the other techniques used
> for namenode directory backups?

Duplicate dfs.name.dir directories are what we use in production. That
is, at least two paths, one local FS and another NFS mounted:

dfs.name.dir = /path/to/local/dfs/name,/path/to/nfs/dfs/name

This will give you two copies of good metadata, and loss of one can
still be handled.
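
As a rough sketch, the same setting written out as a property block
(typically in hdfs-site.xml) would look something like the following.
Both paths are placeholders, not real locations; substitute your own
local disk and NFS mount. I'd also keep the list free of spaces around
the comma to be safe, since I wouldn't assume every version trims them.

```xml
<property>
  <name>dfs.name.dir</name>
  <!-- Placeholder paths: the first on a local disk, the second on an NFS mount.
       The NameNode writes its name table to every directory in the list. -->
  <value>/path/to/local/dfs/name,/path/to/nfs/dfs/name</value>
</property>
```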

> 2. Are there any ways or tools to recover some of the data even if the
> namenode crashes?

If there's any form of fsimage/edits left, a manual or automated
recovery can be attempted via tools such as the offline image viewer
(oiv) and offline edits viewer (oev), the NN's "-recover" flag if your
version has it, or even with a hexdump and some time.

If there's no trace of the fsimage files, no backups of them from any
date, and no SNN checkpoints from the past, then the metadata is all
gone and there's no recovery.

-- 
Harsh J
