Message-ID: <5865784.1174686213012.JavaMail.jira@brutus>
Date: Fri, 23 Mar 2007 14:43:33 -0700 (PDT)
From: "dhruba borthakur (JIRA)" <jira@apache.org>
To: hadoop-dev@lucene.apache.org
Subject: [jira] Updated: (HADOOP-1093) NNBench generates millions of
 NotReplicatedYetException in Namenode log


     [ https://issues.apache.org/jira/browse/HADOOP-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1093:
-------------------------------------

    Attachment: notyetreplicated.patch

First draft of the code. Please review.

I also made the following optimization:
In the earlier code, a datanode first wrote the data to its local block file and only then forwarded it to the next datanode in the pipeline. This patch reverses that order: a datanode first forwards the data to the next datanode and then writes it to its local block. This should improve performance, because the downstream datanodes no longer have to wait for the upstream local write before they receive the data.
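
The change in ordering can be illustrated with a minimal sketch. This is not the code from notyetreplicated.patch; the class PipelineOrderSketch, the helper RecordingStream, and both method names are hypothetical and exist only to show the two orderings side by side. The streams here simply record which destination was written first.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class PipelineOrderSketch {

    /** Hypothetical stand-in for a real stream: records its name on every write. */
    static final class RecordingStream extends OutputStream {
        private final String name;
        private final List<String> events;

        RecordingStream(String name, List<String> events) {
            this.name = name;
            this.events = events;
        }

        @Override
        public void write(int b) {
            events.add(name);
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            events.add(name);
        }
    }

    /** Earlier order (as described above): local write first, then forward. */
    static void writeLocalThenForward(byte[] data, OutputStream local, OutputStream next)
            throws IOException {
        local.write(data);
        if (next != null) {
            next.write(data);
            next.flush();
        }
    }

    /** New order (as described above): forward first, then local write. */
    static void forwardThenWriteLocal(byte[] data, OutputStream local, OutputStream next)
            throws IOException {
        if (next != null) {
            next.write(data);
            next.flush();
        }
        local.write(data);
    }

    /** Runs one ordering and returns the sequence of destinations written. */
    static List<String> demo(boolean forwardFirst) {
        List<String> events = new ArrayList<>();
        OutputStream local = new RecordingStream("local", events);
        OutputStream next = new RecordingStream("next", events);
        try {
            if (forwardFirst) {
                forwardThenWriteLocal(new byte[] {42}, local, next);
            } else {
                writeLocalThenForward(new byte[] {42}, local, next);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println("old order: " + demo(false));
        System.out.println("new order: " + demo(true));
    }
}
```

In a real pipeline the benefit would come from the next datanode starting its own work while the current one is still writing to disk; the sketch only shows the ordering, not the timing.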

> NNBench generates millions of NotReplicatedYetException in Namenode log
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-1093
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1093
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.12.0
>            Reporter: Nigel Daley
>         Assigned To: dhruba borthakur
>             Fix For: 0.13.0
>
>         Attachments: notyetreplicated.patch
>
>
> Running NNBench on latest trunk (0.12.1 candidate) on a few hundred nodes yielded 2.3 million of these exceptions in the NN log:
>    2007-03-08 09:23:03,053 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 8020 call error:
>    org.apache.hadoop.dfs.NotReplicatedYetException: Not replicated yet
>         at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:803)
>         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:309)
>         at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:336)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:559)
> I ran NNBench to create files with block size set to 1 and replication set to 1.  NNBench then writes 1 byte to each file.  Minimum replication for the cluster is the default, i.e. 1.  If NNBench encounters an exception during either the create or the write operation, it loops and tries again.  Multiply this by 1000 files per node and a few hundred nodes.
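
The retry behaviour described in the report can be sketched as follows. This is not NNBench's actual code; the class RetryLoopSketch, the Operation interface, and the simulated failure count are assumptions made for illustration. The point is that every failed attempt is one more exception on the server side, so an unbounded retry loop run for many files on many nodes multiplies quickly.

```java
import java.io.IOException;

public class RetryLoopSketch {

    /** Hypothetical stand-in for a create or write call that may fail. */
    interface Operation {
        void run() throws IOException;
    }

    /** Retries the operation until it succeeds; returns the number of failed attempts. */
    static int retryUntilSuccess(Operation op) {
        int failures = 0;
        while (true) {
            try {
                op.run();
                return failures;
            } catch (IOException e) {
                // Each failure corresponds to one exception logged by the server.
                failures++;
            }
        }
    }

    /** Simulates an operation that fails a fixed number of times before succeeding. */
    static int demo(int failuresBeforeSuccess) {
        int[] remaining = {failuresBeforeSuccess};
        return retryUntilSuccess(() -> {
            if (remaining[0] > 0) {
                remaining[0]--;
                throw new IOException("Not replicated yet");
            }
        });
    }

    public static void main(String[] args) {
        System.out.println("failed attempts before success: " + demo(3));
    }
}
```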
