hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10492) open daughter regions can unpredictably take long time
Date Tue, 11 Feb 2014 22:25:20 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13898410#comment-13898410
] 

stack commented on HBASE-10492:
-------------------------------

6minutes is eons, a body blow.  If you run oracle jvm, does it exhibit same latencies?

ibm jdk taking this long scheduling a thread is a problem, a problem for us if we are to run
well on ibmjdk.   We should dig and figure if it something about the way this thread is scheduled
in particular or if it is the case that any thread can be swapped out for pauses of this magnitude.

> open daughter regions can unpredictably take long time
> ------------------------------------------------------
>
>                 Key: HBASE-10492
>                 URL: https://issues.apache.org/jira/browse/HBASE-10492
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.96.0
>            Reporter: Jerry He
>
> I have seen during a stress testing client was getting 
> "RetriesExhaustedWithDetailsException: Failed 748 actions: NotServingRegionException"
> On the master log, 2014-02-08 20:43 is the timestamp from OFFLINE to SPLITTING_NEW, 2014-02-08
21:41 is the timestamp from SPLITTING_NEW to OPEN.
> The corresponding time period on the region sever log is:
> {code}
> 2014-02-08 20:44:12,662 WARN org.apache.hadoop.hbase.regionserver.HRegionFileSystem:
.regioninfo file not found for region: 010c1981882d1a59201af5e2dc589d44
> 2014-02-08 20:44:12,666 WARN org.apache.hadoop.hbase.regionserver.HRegionFileSystem:
.regioninfo file not found for region: c2eb9b7971ca7f3fed3da86df5b788e7
> {code}
> There were no INFO related to these two regions until: (at the end see this: Split took
57mins, 16sec)
> {code}
> 2014-02-08 21:41:14,029 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined c2eb9b7971ca7f3fed3da86df5b788e7;
next sequenceid=213355
> 2014-02-08 21:41:14,031 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined 010c1981882d1a59201af5e2dc589d44;
next sequenceid=213354
> 2014-02-08 21:41:14,032 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Post
open deploy tasks for region=tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7.
> 2014-02-08 21:41:14,054 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Updated row
tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7.
with server=hdtest208.svl.ibm.com,60020,1391887547473
> 2014-02-08 21:41:14,054 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Finished
post open deploy task for tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7.
> 2014-02-08 21:41:14,054 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Post
open deploy tasks for region=tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44.
> 2014-02-08 21:41:14,059 INFO org.apache.hadoop.hbase.regionserver.HStore: Completed compaction
of 10 file(s) in cf of tpch_hb_1000_2.lineitem,^\x01\x8B\xE7(\x80\x01\x80\x93\xFD\x01\x01\x80\x00\x00\x00\xB5\x0E\xCC'\x01\x80\x00\x00\x03,1391918508561.1fbcfc0a792435dfd73ec5b0ef5c953c.
into 451be6df8c604993ae540b808d9cfa08(size=72.8 M), total size for store is 2.4 G. This selection
was in queue for 0sec, and took 1mins, 40sec to execute.
> 2014-02-08 21:41:14,059 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread:
Completed compaction: Request = regionName=tpch_hb_1000_2.lineitem,^\x01\x8B\xE7(\x80\x01\x80\x93\xFD\x01\x01\x80\x00\x00\x00\xB5\x0E\xCC'\x01\x80\x00\x00\x03,1391918508561.1fbcfc0a792435dfd73ec5b0ef5c953c.,
storeName=cf, fileCount=10, fileSize=94.1 M, priority=9883, time=1391924373278861000; duration=1mins,
40sec
> 2014-02-08 21:41:14,059 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction
on cf in region tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7.
> 2014-02-08 21:41:14,059 INFO org.apache.hadoop.hbase.regionserver.HStore: Starting compaction
of 10 file(s) in cf of tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7.
into tmpdir=gpfs:/hbase/data/default/tpch_hb_1000_2.lineitem/c2eb9b7971ca7f3fed3da86df5b788e7/.tmp,
totalSize=709.7 M
> 2014-02-08 21:41:14,066 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Updated row
tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44. with server=hdtest208.svl.ibm.com,60020,1391887547473
> 2014-02-08 21:41:14,066 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Finished
post open deploy task for tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44.
> 2014-02-08 21:41:14,190 INFO org.apache.hadoop.hbase.regionserver.SplitRequest: Region
split, hbase:meta updated, and report to master. Parent=tpch_hb_1000_2.lineitem,,1391918508561.b576e8db65d56ec08db5ca900587c28d.,
new regions: tpch_hb_1000_2.lineitem,,1391921037353.010c1981882d1a59201af5e2dc589d44., tpch_hb_1000_2.lineitem,]\x01\x8B\xE9\xF4\x8A\x01\x80p\xA3\xA4\x01\x80\x00\x00\x00\xB6\xB7+\x02\x01\x80\x00\x00\x02,1391921037353.c2eb9b7971ca7f3fed3da86df5b788e7..
Split took 57mins, 16sec
> {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Mime
View raw message