From: "Hairong Kuang (JIRA)"
To: hadoop-dev@lucene.apache.org
Date: Fri, 9 Mar 2007 14:51:09 -0800 (PST)
Subject: [jira] Commented: (HADOOP-1093) NNBench generates millions of NotReplicatedYetException in Namenode log
Message-ID: <5805436.1173480669688.JavaMail.jira@brutus>
In-Reply-To: <1726441.1173389664163.JavaMail.root@brutus>

[ https://issues.apache.org/jira/browse/HADOOP-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12479753 ]

Hairong Kuang commented on HADOOP-1093:
---------------------------------------

Ok,
after I talked with Dhruba, I got some background information on this issue. The first piece of code is not a problem. The second piece of the code is a bug. The check should be reversed. But I do not think this will cause an infinite loop because the max # of retries is 3.

> NNBench generates millions of NotReplicatedYetException in Namenode log
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-1093
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1093
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.12.0
>            Reporter: Nigel Daley
>         Assigned To: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.12.1
>
>
> Running NNBench on latest trunk (0.12.1 candidate) on a few hundred nodes yielded 2.3 million of these exceptions in the NN log:
> 2007-03-08 09:23:03,053 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 8020 call error:
> org.apache.hadoop.dfs.NotReplicatedYetException: Not replicated yet
>         at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:803)
>         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:309)
>         at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:336)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:559)
> I run NNBench to create files with block size set to 1 and replication set to 1. NNBench then writes 1 byte to the file. Minimum replication for the cluster is the default, ie 1. If it encounters an exception while trying to do either the create or write operations, it loops and tries again. Multiply this by 1000 files per node and a few hundred nodes.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.