Date: Mon, 10 Nov 2014 11:21:48 -0800
Subject: Re: what can cause RegionTooBusyException?
From: Ted Yu
To: "user@hbase.apache.org"

There could be more than one reason why RegionTooBusyException is thrown.
Below are two (from HRegion):

   * We throw RegionTooBusyException if above memstore limit
   * and expect client to retry using some kind of backoff
   */
  private void checkResources()

   * Try to acquire a lock. Throw RegionTooBusyException
   * if failed to get the lock in time. Throw InterruptedIOException
   * if interrupted while waiting for the lock.
   */
  private void lock(final Lock lock, final int multiplier)

How many tasks may write to this row concurrently?

Which 0.98 release are you using?

Cheers

On Mon, Nov 10, 2014 at 11:10 AM, Brian Jeltema <brian.jeltema@digitalenvoy.net> wrote:

> I’m running a map/reduce job against a table that is performing a large
> number of writes (probably updating every row).
> The job is failing with the exception below. This is a solid failure; it
> dies at the same point in the application,
> and at the same row in the table. So I doubt it’s a conflict with
> compaction (and the UI shows no compaction in progress),
> or that there is a load-related cause.
>
> ‘hbase hbck’ does not report any inconsistencies. The
> ‘waitForAllPreviousOpsAndReset’ leads me to suspect that
> there is an operation in progress that is hung and blocking the update. I
> don’t see anything suspicious in the HBase logs.
> The data at the point of failure is not unusual, and is identical to many
> preceding rows.
> Does anybody have any ideas of what I should look for to find the cause of
> this RegionTooBusyException?
>
> This is Hadoop 2.4 and HBase 0.98.
>
> 14/11/10 13:46:13 INFO mapreduce.Job: Task Id :
> attempt_1415210751318_0010_m_000314_1, Status : FAILED
> Error:
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed
> 1744 actions: RegionTooBusyException: 1744 times,
>         at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:207)
>         at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1700(AsyncProcess.java:187)
>         at org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1568)
>         at org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:1023)
>         at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:995)
>         at org.apache.hadoop.hbase.client.HTable.put(HTable.java:953)
>
> Brian
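
The HRegion comment quoted above says the client is expected to retry
"using some kind of backoff". Below is a minimal sketch of that idea in plain
Java. BackoffRetry, BusyException, withBackoff and its parameters are
hypothetical names used only for illustration; BusyException stands in for
RegionTooBusyException so the example compiles without the HBase client on
the classpath. The HBase client already retries on its own (the
RetriesExhaustedWithDetailsException in the trace means those retries ran
out), so this shows the general pattern only, not what the client does
internally.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Sketch only: retry an operation with exponential backoff while the server reports "busy". */
final class BackoffRetry {

    /** Hypothetical stand-in for org.apache.hadoop.hbase.RegionTooBusyException. */
    static class BusyException extends IOException {
        BusyException(String message) {
            super(message);
        }
    }

    /**
     * Runs op and retries it when it throws BusyException.
     * At most maxAttempts calls are made in total. The pause starts at
     * initialPauseMs and doubles after every failed attempt.
     * Any other exception is passed through without a retry.
     */
    static <T> T withBackoff(Callable<T> op, int maxAttempts, long initialPauseMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long pauseMs = initialPauseMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return op.call();
            } catch (BusyException e) {
                if (attempt >= maxAttempts) {
                    throw e; // out of attempts, let the caller see the failure
                }
                Thread.sleep(pauseMs); // back off before trying again
                pauseMs *= 2;
            }
        }
    }
}
```

A caller would wrap the write in it, for example
withBackoff(() -> { table.put(put); return null; }, 5, 100L), where table and
put are whatever the job already uses. If the same row fails every time, as
described in the original report, backoff alone will not help and the
questions above (concurrent writers, exact 0.98 release) still matter.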