Mailing-List: contact dev-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@hbase.apache.org
Date: Tue, 1 Apr 2014 15:43:17 +0000 (UTC)
From: "Andrew Purtell (JIRA)" <jira@apache.org>
To: dev@hbase.apache.org
Message-ID: <JIRA.12705843.1396320988072.44224.1396366997640@arcas>
In-Reply-To: <JIRA.12705843.1396320988072@arcas>
References: <JIRA.12705843.1396320988072@arcas>
Subject: [jira] [Resolved] (HBASE-10882) Bulkload process hangs on regions
 randomly and finally throws RegionTooBusyException
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable


     [ https://issues.apache.org/jira/browse/HBASE-10882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell resolved HBASE-10882.
------------------------------------

    Resolution: Invalid

Please ask for assistance on the user@hbase.apache.org mailing list.

> Bulkload process hangs on regions randomly and finally throws RegionTooBusyException
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-10882
>                 URL: https://issues.apache.org/jira/browse/HBASE-10882
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.94.10
>         Environment: rhel 5.6, jdk1.7.0_45, hadoop-2.2.0-cdh5.0.0
>            Reporter: Victor Xu
>         Attachments: jstack_5105.log
>
>
> I ran into this problem early one morning several days ago. It happened when I used the hadoop completebulkload command to bulk load some HDFS files into an HBase table. Several regions hung, and after three retries they all threw RegionTooBusyException. Fortunately, I captured the jstack output of the HRegionServer process hosting one of the affected regions just in time.
> I found that the bulkload process was waiting for a write lock:
> at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.tryLock(ReentrantReadWriteLock.java:1115)
> The lock id is 0x00000004054ecbf0.
> In the meantime, many other Get/Scan operations were also waiting on the same lock id. And, of course, they were waiting for the read lock:
> at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.tryLock(ReentrantReadWriteLock.java:873)
> The most ridiculous thing is NO ONE OWNED THE LOCK! I searched the jstack output carefully but could not find any thread that claimed to own the lock.
> When I restarted the bulk load process, it failed at different regions but with the same RegionTooBusyException.
> I guess the region may have been running a compaction at that time that owned the lock, but I couldn’t find any compaction info in the HBase logs.
> Finally, after several days’ hard work, the only temporary solution found for this problem was TRIGGERING A MAJOR COMPACTION BEFORE THE BULKLOAD.
> So which process owned the lock? Has anyone come across the same problem before?
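
One possible reading of the "no one owned the lock" observation, offered as an assumption rather than a diagnosis of this issue: a ReentrantReadWriteLock records an exclusive owner thread only for its write lock, so a thread that merely holds the read lock does not show up as an owner in a thread dump. The following minimal Java sketch (the class name ReadLockHolderDemo and the method probe are hypothetical and not taken from HBase) shows a timed write-lock tryLock failing while the lock reports no write owner at all:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadLockHolderDemo {

    // Hypothetical helper: one thread holds the read lock while the caller
    // attempts a timed write-lock acquisition and inspects the lock state.
    static String probe() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread reader = new Thread(() -> {
            lock.readLock().lock();
            try {
                held.countDown();   // signal that the read lock is now held
                release.await();    // keep holding it until told to stop
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.readLock().unlock();
            }
        }, "long-running-reader");
        reader.start();

        try {
            held.await();
            // The writer times out because a reader still holds the lock.
            boolean gotWrite = lock.writeLock().tryLock(100, TimeUnit.MILLISECONDS);
            String result = "writeAcquired=" + gotWrite
                    + " writeLocked=" + lock.isWriteLocked()
                    + " readHolds=" + lock.getReadLockCount();
            if (gotWrite) {
                lock.writeLock().unlock();
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();    // let the reader thread finish
            try {
                reader.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        // prints: writeAcquired=false writeLocked=false readHolds=1
        System.out.println(probe());
    }
}
```

In this sketch the lock is busy (readHolds=1) yet has no write owner, which matches the symptom of waiters with no visible owner. Whether a long-held read lock is what actually happened on the region server in this report is not established by the information above.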


--
This message was sent by Atlassian JIRA
(v6.2#6252)