hadoop-common-dev mailing list archives

From "Hairong Kuang (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-1108) Checksumed file system should retry reading if a different replica is found when handle ChecksumException
Date Mon, 12 Mar 2007 18:25:09 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-1108:
----------------------------------

       Assignee: Hairong Kuang  (was: dhruba borthakur)
    Description: Currently there is a bug in the code where a checksummed file system throws
an exception if a different replica is found but retries otherwise when handling a ChecksumException.
 (was: Running NNBench on latest trunk (0.12.1 candidate) on a few hundred nodes yielded 2.3
million of these exceptions in the NN log:

   2007-03-08 09:23:03,053 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 8020 call error:
   org.apache.hadoop.dfs.NotReplicatedYetException: Not replicated yet
        at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:803)
        at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:309)
        at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:336)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:559)

I run NNBench to create files with block size set to 1 and replication set to 1.  NNBench
then writes 1 byte to the file.  Minimum replication for the cluster is the default, i.e., 1.
 If it encounters an exception while trying to do either the create or write operations, it
loops and tries again.  Multiply this by 1000 files per node and a few hundred nodes.
)
        Summary: Checksumed file system should  retry reading if a different replica is found
when handle ChecksumException  (was: CLONE -NNBench generates millions of NotReplicatedYetException
in Namenode log)

> Checksumed file system should  retry reading if a different replica is found when handle
ChecksumException
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1108
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1108
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.12.0
>            Reporter: dhruba borthakur
>         Assigned To: Hairong Kuang
>            Priority: Blocker
>             Fix For: 0.12.1
>
>         Attachments: notyetreplciatedexception.patch
>
>
> Currently there is a bug in the code where a checksummed file system throws an exception
if a different replica is found but retries otherwise when handling a ChecksumException.
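
The intended behavior described above (retry when a different replica is found, rethrow otherwise) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the attached patch or Hadoop's actual code: ReplicatedSource, ChecksumMismatchException, FakeSource, and readWithRetry are hypothetical names introduced here.

```java
import java.io.IOException;

class ChecksumRetrySketch {

    /** Hypothetical stand-in for a checksum failure; not Hadoop's ChecksumException. */
    static class ChecksumMismatchException extends IOException {
        ChecksumMismatchException(String msg) { super(msg); }
    }

    /** Hypothetical view of a data source that may have several replicas. */
    interface ReplicatedSource {
        byte[] read() throws IOException;

        /** Returns true if it switched to a different replica, false if none is left. */
        boolean seekToNewSource();
    }

    /** Reads from the source, moving to another replica after each checksum failure. */
    static byte[] readWithRetry(ReplicatedSource in) throws IOException {
        while (true) {
            try {
                return in.read();
            } catch (ChecksumMismatchException e) {
                // Intended logic: retry only when a different replica was found.
                // The bug described above had this condition inverted.
                if (!in.seekToNewSource()) {
                    throw e; // no other replica to try, so surface the error
                }
            }
        }
    }

    /** In-memory source for the demo: a null entry marks a corrupt replica. */
    static class FakeSource implements ReplicatedSource {
        private final byte[][] replicas;
        private int current = 0;

        FakeSource(byte[]... replicas) { this.replicas = replicas; }

        public byte[] read() throws IOException {
            if (replicas[current] == null) {
                throw new ChecksumMismatchException("replica " + current + " is corrupt");
            }
            return replicas[current];
        }

        public boolean seekToNewSource() {
            if (current + 1 >= replicas.length) {
                return false;
            }
            current++;
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // First replica is corrupt, second is good: the read should succeed.
        byte[] data = readWithRetry(new FakeSource(null, new byte[] {42}));
        System.out.println(data[0]); // prints 42
    }
}
```

If every replica is corrupt, readWithRetry rethrows the last checksum error instead of looping on the same data.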

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

