hadoop-common-dev mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1093) NNBench generates millions of NotReplicatedYetException in Namenode log
Date Thu, 22 Mar 2007 23:28:32 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12483355 ]

dhruba borthakur commented on HADOOP-1093:

One proposal to fix this issue:

1. Make the dfsclient *not* invoke reportWrittenBlock(). Instead, the dfsclient will instruct
each datanode in the pipeline to send blockReceived confirmation to the namenode.

2. The datanode already accumulates all pending blockReceived() calls and does them through
the offerService thread. So, no change is needed to datanode.

3. The getAdditionalBlock() method on the namenode currently checks that all previously allocated
blocks of this file have reached their minimum replication factor. I plan on removing this
check altogether. This is not really required but is a performance optimization.

4. The close() method on the DFSClient waits for the minimum number of replicas of each block
to be created. This code remains unchanged.
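
The four steps above can be sketched as a toy, in-memory model. This is only an illustration of the proposed flow under simplifying assumptions, not the actual Hadoop code: ToyNameNode, ToyDataNode, receiveBlock(), offerServiceOnce() and isFileComplete() are hypothetical names invented for this sketch, and RPC, leases and real block storage are left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for the namenode (hypothetical class, not org.apache.hadoop.dfs.NameNode).
class ToyNameNode {
    private final int minReplication;
    // block id -> number of replicas confirmed so far via blockReceived()
    private final Map<Long, Integer> confirmedReplicas = new HashMap<>();
    // file name -> ids of the blocks allocated to that file
    private final Map<String, List<Long>> fileBlocks = new HashMap<>();
    private long nextBlockId = 0;

    ToyNameNode(int minReplication) {
        this.minReplication = minReplication;
    }

    // Step 3: allocate a new block WITHOUT checking whether earlier blocks
    // of the file have reached minimum replication.
    long getAdditionalBlock(String file) {
        long id = nextBlockId++;
        fileBlocks.computeIfAbsent(file, f -> new ArrayList<>()).add(id);
        confirmedReplicas.put(id, 0);
        return id;
    }

    // Steps 1 and 2: the confirmation comes from a datanode, not from the client.
    void blockReceived(long blockId) {
        confirmedReplicas.merge(blockId, 1, Integer::sum);
    }

    // Step 4: the check that close() keeps waiting on (hypothetical helper name).
    boolean isFileComplete(String file) {
        for (long id : fileBlocks.getOrDefault(file, new ArrayList<>())) {
            if (confirmedReplicas.get(id) < minReplication) {
                return false;
            }
        }
        return true;
    }
}

// Toy stand-in for a datanode that queues confirmations (hypothetical class).
class ToyDataNode {
    private final List<Long> pendingReceived = new ArrayList<>();

    // Called when the block data has arrived through the write pipeline.
    void receiveBlock(long blockId) {
        pendingReceived.add(blockId);
    }

    // Stands in for one pass of the offerService thread: flush all queued
    // blockReceived confirmations to the namenode.
    void offerServiceOnce(ToyNameNode nn) {
        for (long id : pendingReceived) {
            nn.blockReceived(id);
        }
        pendingReceived.clear();
    }
}
```

In this model a client can call getAdditionalBlock() twice in a row without any exception, even though the datanode has not yet reported the first block. Only the completeness check used by close() depends on the confirmations having arrived, which matches the intent of steps 3 and 4.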

> NNBench generates millions of NotReplicatedYetException in Namenode log
> -----------------------------------------------------------------------
>                 Key: HADOOP-1093
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1093
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.12.0
>            Reporter: Nigel Daley
>         Assigned To: dhruba borthakur
>             Fix For: 0.13.0
> Running NNBench on latest trunk (0.12.1 candidate) on a few hundred nodes yielded 2.3 million of these exceptions in the NN log:
>    2007-03-08 09:23:03,053 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 8020 call error:
>    org.apache.hadoop.dfs.NotReplicatedYetException: Not replicated yet
>         at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:803)
>         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:309)
>         at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:336)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:559)
> I run NNBench to create files with block size set to 1 and replication set to 1.  NNBench then writes 1 byte to the file.  Minimum replication for the cluster is the default, i.e. 1.  If it encounters an exception while trying to do either the create or write operations, it loops and tries again.  Multiply this by 1000 files per node and a few hundred nodes.

