From: Brian Jeltema
Subject: Re: need help understand log output
Date: Mon, 8 Sep 2014 07:10:32 -0400
To: user@hbase.apache.org

> When number of attempts is greater than the value of
> hbase.client.start.log.errors.counter (default 9), AsyncProcess would
> produce logs cited below.
> The interval following 'retrying after ' is the backoff time.
>
> Which release of HBase are you using ?
>

HBase Version 0.98.0.2.1.1.0-385-hadoop2

The MR job is reading from an HBase snapshot, if that’s relevant.

> Cheers
>
>
> On Sun, Sep 7, 2014 at 8:50 AM, Brian Jeltema <
> brian.jeltema@digitalenvoy.net> wrote:
>
>> I have a map/reduce job that is consistently failing with timeouts. The
>> failing mapper log files contain a series
>> of records similar to those below. When I look at the hbase and hdfs logs
>> (on foo.net in this case) I don’t see
>> anything obvious at these timestamps. The mapper task times out at/near
>> attempt=25/35. Can anyone shed light
>> on what these log entries mean?
>>
>> Thanks - Brian
>>
>>
>> 2014-09-07 09:36:51,421 INFO [htable-pool1-t1]
>> org.apache.hadoop.hbase.client.AsyncProcess: #3, table=Host, primary,
>> attempt=10/35 failed 1062 ops, last exception: null on foo.net,60020,1406043467187,
>> tracking started null, retrying after 10029 ms, replay 1062 ops
>> 2014-09-07 09:37:01,642 INFO [htable-pool1-t1]
>> org.apache.hadoop.hbase.client.AsyncProcess: #3, table=Host, primary,
>> attempt=11/35 failed 1062 ops, last exception: null on foo.net,60020,1406043467187,
>> tracking started null, retrying after 10023 ms, replay 1062 ops
>> 2014-09-07 09:37:12,064 INFO [htable-pool1-t1]
>> org.apache.hadoop.hbase.client.AsyncProcess: #3, table=Host, primary,
>> attempt=12/35 failed 1062 ops, last exception: null on foo.net,60020,1406043467187,
>> tracking started null, retrying after 20182 ms, replay 1062 ops
>> 2014-09-07 09:37:32,708 INFO [htable-pool1-t1]
>> org.apache.hadoop.hbase.client.AsyncProcess: #3, table=Host, primary,
>> attempt=13/35 failed 1062 ops, last exception: null on foo.net,60020,1406043467187,
>> tracking started null, retrying after 20140 ms, replay 1062 ops
>> 2014-09-07 09:37:52,940 INFO [htable-pool1-t1]
>> org.apache.hadoop.hbase.client.AsyncProcess: #3, table=Host, primary,
>> attempt=14/35 failed 1062 ops, last exception: null on foo.net,60020,1406043467187,
>> tracking started null, retrying after 20041 ms, replay 1062 ops
>> 2014-09-07 09:38:13,324 INFO [htable-pool1-t1]
>> org.apache.hadoop.hbase.client.AsyncProcess: #3, table=Host, primary,
>> attempt=15/35 failed 1062 ops, last exception: null on foo.net,60020,1406043467187,
>> tracking started null, retrying after 20041 ms, replay 1062 ops
>>
>>
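
To pull the attempt counter and the backoff interval out of records like those quoted above, something along these lines might help. It is only a rough Python sketch: it assumes each record has first been joined onto a single line, and that the format matches the samples in this thread (other HBase releases may log differently). RETRY_RE and parse_retry are hypothetical names for illustration, not part of HBase.

```python
import re

# Pattern inferred only from the AsyncProcess records quoted in this thread.
RETRY_RE = re.compile(
    r"attempt=(?P<attempt>\d+)/(?P<max>\d+) failed (?P<ops>\d+) ops, "
    r".*?retrying after (?P<backoff_ms>\d+) ms"
)


def parse_retry(record):
    """Return (attempt, max_attempts, failed_ops, backoff_ms), or None if no match."""
    m = RETRY_RE.search(record)
    if m is None:
        return None
    return (int(m["attempt"]), int(m["max"]), int(m["ops"]), int(m["backoff_ms"]))


# First record from the thread, joined onto one line.
sample = (
    "2014-09-07 09:36:51,421 INFO [htable-pool1-t1] "
    "org.apache.hadoop.hbase.client.AsyncProcess: #3, table=Host, primary, "
    "attempt=10/35 failed 1062 ops, last exception: null on foo.net,60020,1406043467187, "
    "tracking started null, retrying after 10029 ms, replay 1062 ops"
)

print(parse_retry(sample))  # prints (10, 35, 1062, 10029)
```

Running it over a whole mapper log would show how the interval after 'retrying after' changes from one attempt to the next, which is the backoff time described in the reply.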