hbase-dev mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: HBASE-3234 and bad datanode error
Date Tue, 01 Feb 2011 06:09:11 GMT
On Mon, Jan 31, 2011 at 10:04 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> J-D:
> Thanks for your kind offer.
> Full output from a reducer contains CIQ proprietary code information which I
> cannot disclose.
>
> Also the size of data node logs would be big.
>
> It would be nice if people from Cloudera can take a look. Or they can
> outline their hadoop release schedule which covers hdfs-724 and hdfs-895.
>

Like JD said, you have to provide a lot more data than you're providing.

"Retrying connect" indicates likely network issues, but who knows past that?

Doesn't look like HDFS-724, and we've had HDFS-895 in our build for months
and months.

-Todd


>
> On Mon, Jan 31, 2011 at 11:23 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>
> > The timestamps from those logs don't correlate to the issue you pasted
> > earlier, would it be possible to see all logs from a single instance
> > of the issue? It would make our life much easier helping you.
> >
> > In fact, I would like to see all the logs from all the datanodes plus
> > the full output from a reducer. You could compress it and leave that
> > on a webserver or send it to me directly. What you pasted only gives a
> > very restricted view of what happened.
> >
> > J-D
> >
> > On Sun, Jan 30, 2011 at 7:40 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > Datanode log snippet can be found here:
> > > http://pastebin.com/Q555XdVU
> > >
> > > Here is reducer log snippet:
> > > http://pastebin.com/a7RBq5aa
> > >
> > > Since cdh3b2 doesn't contain hdfs-724, I am not sure whether Hairong's patch
> > > (https://issues.apache.org/jira/secure/attachment/12459664/hbAckReply.patch)
> > > should be applied.
> > >
> > > If someone can share how hadoop-core-0.20-append-r1056497.jar (with fixed
> > > hdfs-724) is used with their hadoop cluster, that would be great.
> > >
> > > On Mon, Jan 24, 2011 at 4:58 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > >
> > >> Hi,
> > >> Running 0.90 in dev cluster where I used cdh3b2 hadoop jar, I frequently
> > >> saw the following in reduce task log:
> > >>
> > >> INFO [2011-01-24 15:27:39] (ExecUtil.java:258) - 2011-01-24 22:55:39,009 INFO com.carrieriq.m2m.platform.mmp3.output.DimensionMapper: Total requets=15523640 cache hit ratio=0.84543097 avg time=90.1465879780713
> > >> INFO [2011-01-24 15:27:39] (ExecUtil.java:258) - 2011-01-24 23:17:03,216 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception  for block blk_8207645655823156697_2836871java.io.IOException: Bad response 1 for block blk_8207645655823156697_2836871 from datanode 10.202.50.71:50010
> > >> INFO [2011-01-24 15:27:39] (ExecUtil.java:258) -       at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2497)
> > >> INFO [2011-01-24 15:27:39] (ExecUtil.java:258) -
> > >> INFO [2011-01-24 15:27:39] (ExecUtil.java:258) - 2011-01-24 23:17:03,217 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_8207645655823156697_2836871 bad datanode[1] 10.202.50.71:50010
> > >> INFO [2011-01-24 15:27:39] (ExecUtil.java:258) - 2011-01-24 23:17:03,217 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_8207645655823156697_2836871 in pipeline 10.202.50.78:50010, 10.202.50.71:50010: bad datanode 10.202.50.71:50010
> > >> INFO [2011-01-24 15:27:39] (ExecUtil.java:258) - 2011-01-24 23:17:03,252 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /10.202.50.78:50020. Already tried 0 time(s).
> > >> INFO [2011-01-24 15:27:39] (ExecUtil.java:258) - 2011-01-24 23:27:27,931 WARN org.apache.hadoop.mapred.TaskRunner: Parent died.  Exiting
> > >>
> > >> HDFS-895 <https://issues.apache.org/jira/browse/HDFS-895> is in
> > >> http://archive.cloudera.com/cdh/3/hadoop-0.20.2+320.releasenotes.html
> > >>
> > >> Expert opinion on what I saw is appreciated.
> > >>
> > >
> >
>



-- 
Todd Lipcon
Software Engineer, Cloudera
