Mailing-List: contact user-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hbase.apache.org
MIME-Version: 1.0
In-Reply-To: <D62B10F9-D00A-483D-9295-01B7E8FF6C5C@digitalenvoy.net>
References: <D62B10F9-D00A-483D-9295-01B7E8FF6C5C@digitalenvoy.net>
Date: Mon, 10 Nov 2014 11:21:48 -0800
Message-ID: 
 <CALte62ycQefaoLOeQwoo+PSuDLe5pjUKNJK-uYESY_350D+JwA@mail.gmail.com>
Subject: Re: what can cause RegionTooBusyException?
From: Ted Yu <yuzhihong@gmail.com>
To: "user@hbase.apache.org" <user@hbase.apache.org>
Content-Type: multipart/alternative; boundary=001a113a4686f7733405078612f0

--001a113a4686f7733405078612f0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

RegionTooBusyException can be thrown for more than one reason.
Here are two, from HRegion:

   * We throw RegionTooBusyException if above memstore limit
   * and expect client to retry using some kind of backoff
  */
  private void checkResources()

   * Try to acquire a lock.  Throw RegionTooBusyException
   * if failed to get the lock in time. Throw InterruptedIOException
   * if interrupted while waiting for the lock.
   */
  private void lock(final Lock lock, final int multiplier)
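
To make the "retry using some kind of backoff" in the checkResources() comment concrete, here is a minimal sketch of a capped exponential backoff loop. It is not HBase code: BackoffRetry, TooBusyException, withBackoff and pauseMillis are hypothetical names, and TooBusyException only stands in for RegionTooBusyException so the sketch compiles without HBase on the classpath. The 10-second cap is an arbitrary assumption. Note that the stock client already retries internally; the RetriesExhaustedWithDetailsException in the trace below is what surfaces once those retries run out.

```java
import java.util.concurrent.Callable;

// Hypothetical helper for illustration only; not part of HBase.
final class BackoffRetry {

    // Stand-in for HBase's RegionTooBusyException (assumption, so this compiles alone).
    static final class TooBusyException extends java.io.IOException {
        TooBusyException(String message) {
            super(message);
        }
    }

    // Pause before the next try: basePauseMs * 2^attempt, capped at maxPauseMs.
    static long pauseMillis(int attempt, long basePauseMs, long maxPauseMs) {
        long pause = basePauseMs << Math.min(attempt, 20); // bound the shift
        return Math.min(pause, maxPauseMs);
    }

    // Runs the action, retrying only on TooBusyException, up to maxAttempts tries.
    static <T> T withBackoff(Callable<T> action, int maxAttempts, long basePauseMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        TooBusyException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (TooBusyException e) {
                last = e;
                if (attempt + 1 < maxAttempts) {
                    // Assumed cap of 10 seconds between tries.
                    Thread.sleep(pauseMillis(attempt, basePauseMs, 10_000L));
                }
            }
        }
        throw last; // every attempt hit the "too busy" condition
    }
}
```

A caller would wrap the write, e.g. withBackoff(() -> { table.put(put); return null; }, 5, 100L), where table and put are whatever the job already uses. If the region never becomes available (as with a hung lock holder), backoff only delays the failure, which is why the questions below still matter.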

How many tasks may write to this row concurrently?

Which 0.98 release are you using?

Cheers

On Mon, Nov 10, 2014 at 11:10 AM, Brian Jeltema <
brian.jeltema@digitalenvoy.net> wrote:

> I’m running a map/reduce job against a table that is performing a large
> number of writes (probably updating every row).
> The job is failing with the exception below. This is a solid failure; it
> dies at the same point in the application,
> and at the same row in the table. So I doubt it’s a conflict with
> compaction (and the UI shows no compaction in progress),
> or that there is a load-related cause.
>
> ‘hbase hbck’ does not report any inconsistencies. The
> ‘waitForAllPreviousOpsAndReset’ leads me to suspect that
> there is an operation in progress that is hung and blocking the update. I
> don’t see anything suspicious in the HBase logs.
> The data at the point of failure is not unusual, and is identical to many
> preceding rows.
> Does anybody have any ideas of what I should look for to find the cause of
> this RegionTooBusyException?
>
> This is Hadoop 2.4 and HBase 0.98.
>
> 14/11/10 13:46:13 INFO mapreduce.Job: Task Id :
> attempt_1415210751318_0010_m_000314_1, Status : FAILED
> Error:
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed
> 1744 actions: RegionTooBusyException: 1744 times,
>         at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:207)
>         at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1700(AsyncProcess.java:187)
>         at org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1568)
>         at org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:1023)
>         at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:995)
>         at org.apache.hadoop.hbase.client.HTable.put(HTable.java:953)
>
> Brian

--001a113a4686f7733405078612f0--