Mailing-List: contact hadoop-dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hadoop-dev@lucene.apache.org
Message-ID: <26224654.1164856042938.JavaMail.jira@brutus>
Date: Wed, 29 Nov 2006 19:07:22 -0800 (PST)
From: "Konstantin Shvachko (JIRA)" <jira@apache.org>
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-227) Namespace check pointing is not
 performed until the namenode restarts.
In-Reply-To: <17890828.1147909205885.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ http://issues.apache.org/jira/browse/HADOOP-227?page=comments#action_12454523 ] 
            
Konstantin Shvachko commented on HADOOP-227:
--------------------------------------------

The design looks pretty simple and clean.
I still like the merging approach better. It is stand-alone:
- There is no need to change anything in the name-node code.
- It is useful as a maintenance utility for merging edits and images externally.
- It does not lock the name-node.
At some point the name-node data structures should be revised substantially,
and this copy-on-write work will most probably be wasted.
Does it make sense to invest more effort in designing a simpler merge algorithm?
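
To illustrate what "stand-alone" means here, a minimal sketch of an external merge,
assuming a toy model in which the image is a set of paths and the edits log is a list
of (op, path) records. The name apply_edits and the "add"/"delete" ops are hypothetical
and are not the real image or edits formats.

```python
def apply_edits(image, edits):
    """Return a new namespace image with the edits log replayed on top.

    Toy model (an assumption, not the real on-disk format):
    image is a set of paths, edits is a list of (op, path) tuples.
    """
    merged = set(image)  # work on a copy so the input image is left untouched
    for op, path in edits:
        if op == "add":
            merged.add(path)
        elif op == "delete":
            merged.discard(path)
        else:
            raise ValueError(f"unknown edit op: {op!r}")
    return merged
```

Nothing in this sketch touches a running name-node: it only needs a copy of the
image and a copy of the edits, which is why it could also serve as an offline utility.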

If we still choose to go with the proposed design:
- Should we use the "standard names" for the current image and edits files (without .N)?
That is, before checkpointing, edits is renamed to edits.N and a new edits file is created.
- Do we need to keep all old images? It looks like only the last one is required.
This is periodic checkpointing, not a backup procedure.
- If the node crashes in the middle of a checkpoint, it is left with the old image,
old edits, and new edits files. Are we going to apply both the old and new edits during startup?
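
To make the questions above concrete, here is a rough sketch of the rename step and of
one possible replay order after a crash. It assumes a flat directory containing files
named edits and edits.N. The function names roll_edits and edits_replay_order are
hypothetical, and replaying every edits.N before edits is only one possible answer to
the last question, not a description of the actual design.

```python
import os


def roll_edits(dirname, n):
    """Rename the live edits file to edits.<n> and start an empty one."""
    live = os.path.join(dirname, "edits")
    os.rename(live, os.path.join(dirname, f"edits.{n}"))
    open(live, "w").close()  # re-create an empty edits file under the standard name


def edits_replay_order(dirname):
    """List edits files in the order a restart could replay them.

    Assumption: rolled files (edits.N) are replayed in increasing N,
    followed by the live edits file if it exists.
    """
    rolled = []
    for name in os.listdir(dirname):
        suffix = name[len("edits."):]
        if name.startswith("edits.") and suffix.isdigit():
            rolled.append((int(suffix), name))
    ordered = [name for _, name in sorted(rolled)]
    if os.path.exists(os.path.join(dirname, "edits")):
        ordered.append("edits")
    return ordered
```

Under this assumption, a crash in the middle of a checkpoint leaves the old image
plus edits.N and edits, and startup would apply both logs to the old image in that order.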

> Namespace check pointing is not performed until the namenode restarts.
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-227
>                 URL: http://issues.apache.org/jira/browse/HADOOP-227
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.2.0
>            Reporter: Konstantin Shvachko
>         Assigned To: Milind Bhandarkar
>
> In the current implementation, when the name node starts, it reads its image file, then
> the edits file, and then saves the updated image back into the image file.
> The image file is never updated after that.
> In order to provide system reliability, the namespace information should
> be check pointed periodically, and the edits file should be kept relatively small.
