From: Ted Yu
To: dev@hbase.apache.org, kuang.hairong@gmail.com
Date: Sun, 30 Jan 2011 07:40:20 -0800
Subject: Re: HBASE-3234 and bad datanode error

Datanode log snippet can be found here: http://pastebin.com/Q555XdVU
Here is reducer log snippet: http://pastebin.com/a7RBq5aa

Since cdh3b2 doesn't contain hdfs-724, I am not sure whether Hairong's patch
(https://issues.apache.org/jira/secure/attachment/12459664/hbAckReply.patch)
should be applied.

If someone can share how hadoop-core-0.20-append-r1056497.jar (with fixed
hdfs-724) is used with their hadoop cluster, that would be great.

On Mon, Jan 24, 2011 at 4:58 PM, Ted Yu wrote:

> Hi,
> Running 0.90 in dev cluster where I used cdh3b2 hadoop jar, I frequently
> saw the following in reduce task log:
>
> INFO [2011-01-24 15:27:39] (ExecUtil.java:258) - 2011-01-24 22:55:39,009 INFO com.carrieriq.m2m.platform.mmp3.output.DimensionMapper: Total requets=15523640 cache hit ratio=0.84543097 avg time=90.1465879780713
> INFO [2011-01-24 15:27:39] (ExecUtil.java:258) - 2011-01-24 23:17:03,216 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block blk_8207645655823156697_2836871java.io.IOException: Bad response 1 for block blk_8207645655823156697_2836871 from datanode 10.202.50.71:50010
> INFO [2011-01-24 15:27:39] (ExecUtil.java:258) - at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2497)
> INFO [2011-01-24 15:27:39] (ExecUtil.java:258) -
> INFO [2011-01-24 15:27:39] (ExecUtil.java:258) - 2011-01-24 23:17:03,217 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_8207645655823156697_2836871 bad datanode[1] 10.202.50.71:50010
> INFO [2011-01-24 15:27:39] (ExecUtil.java:258) - 2011-01-24 23:17:03,217 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_8207645655823156697_2836871 in pipeline 10.202.50.78:50010, 10.202.50.71:50010: bad datanode 10.202.50.71:50010
> INFO [2011-01-24 15:27:39] (ExecUtil.java:258) - 2011-01-24 23:17:03,252 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.202.50.78:50020. Already tried 0 time(s).
> INFO [2011-01-24 15:27:39] (ExecUtil.java:258) - 2011-01-24 23:27:27,931 WARN org.apache.hadoop.mapred.TaskRunner: Parent died. Exiting
>
> HDFS-895 is in
> http://archive.cloudera.com/cdh/3/hadoop-0.20.2+320.releasenotes.html
>
> Expert opinion on what I saw is appreciated.