hadoop-common-dev mailing list archives

From "Yoram Arnon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-96) name server should log decisions that affect data: block creation, removal, replication
Date Thu, 30 Mar 2006 23:07:47 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-96?page=comments#action_12372589 ] 

Yoram Arnon commented on HADOOP-96:
-----------------------------------

The plan is to add a log line for each change in the name space and each change in block placement
or replication. What we get is effectively a trace of program execution for DFS changes.
The log will go to a new log object, so that this (extensive) logging can be switched on or off.
Name space changes will be logged at level fine, block commit changes at finer, and block
pending changes at finest.
To facilitate tracing of multiple concurrent operations, each line will include the
thread id of the name server's thread. For that we derive a logging class that places the
thread id right after the date/time.
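
A minimal sketch of what such a derived logging class could look like, assuming java.util.logging
(which the fine/finer/finest levels suggest). The class name ThreadIdFormatter, the date pattern
and the field order after the thread id are hypothetical and are not taken from the actual patch.
Note that the id comes from LogRecord, i.e. it is the logging framework's identifier for the
thread that issued the call.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.logging.Formatter;
import java.util.logging.Level;
import java.util.logging.LogRecord;

// Hypothetical formatter: the name and the line layout are assumptions for illustration.
class ThreadIdFormatter extends Formatter {
    // SimpleDateFormat is not thread-safe, so format() is synchronized.
    private final SimpleDateFormat dateFormat = new SimpleDateFormat("yyMMdd HHmmss");

    @Override
    public synchronized String format(LogRecord record) {
        StringBuilder line = new StringBuilder();
        // Date/time first, then the thread id right after it.
        line.append(dateFormat.format(new Date(record.getMillis())));
        line.append(' ').append(record.getThreadID());
        // Level and message follow, so concurrent operations can be separated by thread id.
        line.append(' ').append(record.getLevel().getName());
        line.append(' ').append(formatMessage(record));
        line.append(System.lineSeparator());
        return line.toString();
    }
}
```

A handler attached to the new log object would be given this formatter via setFormatter(), so
every line of the trace carries the id of the thread that produced it.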

We log in the following methods of the name node class, and in the methods of the nameSystem
class that they call (shown in parentheses):
create (startFile)
abandonFileInProgress (abandonFileInProgress)
abandonBlock (abandonBlock)
reportWrittenBlock (blockReceived)
addBlock (getAdditionalBlock)
complete (completeFile)
rename (renameTo)
delete (delete)
mkdirs (mkdirs)
sendHeartbeat (getHeartbeat)
blockReport (processReport)
blockReceived (blockReceived)
errorReport
getBlockWork (pendingTransfer, blocksToInvalidate)
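
To illustrate how calls in these methods might map onto the three levels, here is a rough sketch
assuming java.util.logging. The logger name "dfs.StateChange", the helper class, the message
texts, and the choice of which event counts as a commit or a pending change are all assumptions
made for the example, not part of the plan above.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper; in practice the calls would sit inside the methods listed above.
class StateChangeLogSketch {
    // A dedicated logger (name assumed) so this trace can be switched separately.
    static final Logger STATE_CHANGE_LOG = Logger.getLogger("dfs.StateChange");

    // Name space change: logged at FINE.
    static void logRename(String src, String dst) {
        STATE_CHANGE_LOG.fine("rename: " + src + " to " + dst);
    }

    // Block commit change (assumed here: a datanode reports a received block): FINER.
    static void logBlockReceived(String block, String datanode) {
        STATE_CHANGE_LOG.finer("blockReceived: " + block + " is received from " + datanode);
    }

    // Block pending change (assumed here: a replication request is queued): FINEST.
    static void logPendingTransfer(String block, String target) {
        STATE_CHANGE_LOG.finest("pendingTransfer: ask to replicate " + block + " to " + target);
    }
}
```

With the logger set to Level.FINER, for example, the name space and block commit lines are
emitted while the more verbose pending lines are dropped.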


> name server should log decisions that affect data: block creation, removal, replication
> ---------------------------------------------------------------------------------------
>
>          Key: HADOOP-96
>          URL: http://issues.apache.org/jira/browse/HADOOP-96
>      Project: Hadoop
>         Type: Improvement
>   Components: dfs
>     Versions: 0.1
>     Reporter: Yoram Arnon
>     Priority: Critical

>
> Currently, there's no way to analyze and debug DFS errors where blocks disappear.
> The name server should log its decisions that affect data, including block creation, removal,
> and replication:
> - block <b> created, assigned to datanodes A, B, ...
> - datanode A dead, block <b> underreplicated(1), replicating to datanode C
> - datanode B dead, block <b> underreplicated(2), replicating to datanode D
> - datanode A alive, block <b> overreplicated, removing from datanode D
> - block <b> removed from datanodes C, D, ...
> That will enable me to track down, two weeks later, a block that's missing from a file,
> and to debug the name server.
> extra credit:
> - rotate log file, as it might grow large
> - make this behaviour optional/configurable
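
For the two "extra credit" items, one possible approach with java.util.logging is a size-limited
FileHandler plus a level switch. This is only a sketch: the class name RotatingTraceLog, the
configure() helper and its parameters are hypothetical, and the issue does not say how rotation
or the on/off switch would actually be implemented.

```java
import java.io.IOException;
import java.util.logging.FileHandler;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

// Hypothetical helper showing rotation and an on/off switch for the trace log.
class RotatingTraceLog {
    static Logger configure(String loggerName, String filePattern, boolean enabled,
                            int maxBytesPerFile, int fileCount) throws IOException {
        Logger log = Logger.getLogger(loggerName);
        // Keep the (extensive) trace out of the parent handlers, e.g. the console.
        log.setUseParentHandlers(false);
        if (!enabled) {
            // Optional/configurable: Level.OFF suppresses every record on this logger.
            log.setLevel(Level.OFF);
            return log;
        }
        // FileHandler rotates through fileCount files once a file reaches maxBytesPerFile;
        // "%g" in the pattern is replaced by the generation number of the file.
        FileHandler handler = new FileHandler(filePattern, maxBytesPerFile, fileCount, true);
        handler.setFormatter(new SimpleFormatter());
        handler.setLevel(Level.ALL);
        log.addHandler(handler);
        log.setLevel(Level.FINEST);
        return log;
    }
}
```

A call such as configure("dfs.StateChange", "/tmp/namenode-trace%g.log", true, 10000000, 5)
(all values illustrative) would keep at most five files of roughly 10 MB each.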

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

