hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Randy Fox <r...@connexity.com>
Subject Re: Slow region moves
Date Thu, 15 Oct 2015 18:29:22 GMT
Here is the region server log distilled down to events pertinent to the region move.  Not much
being logged while the Store is being closed.
This region was about 1.4G compressed, which I consider small.

2015-10-15 08:34:28,510 INFO org.apache.hadoop.hbase.regionserver.RSRpcServices: Close dad6c71ed395df19220ef1056a110086,
moving to hb17.prod1.connexity.net,60020,1444425924021
2015-10-15 08:34:28,510 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler:
Processing close of Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
2015-10-15 08:34:28,511 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.:
disabling compactions & flushes
2015-10-15 08:34:28,511 INFO org.apache.hadoop.hbase.regionserver.HRegion: Running close preflush
of Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
2015-10-15 08:34:28,511 INFO org.apache.hadoop.hbase.regionserver.HRegion: Started memstore
flush for Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.,
current region memstore size 44.89 MB, and 1/1 column families' memstores are being flushed.
2015-10-15 08:34:28,511 WARN org.apache.hadoop.hbase.regionserver.wal.FSHLog: Couldn't find
oldest seqNum for the region we are about to flush: [dad6c71ed395df19220ef1056a110086]
2015-10-15 08:34:29,137 INFO org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher: Flushed,
sequenceid=70839856845, memsize=44.9 M, hasBloomFilter=false, into tmp file hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/.tmp/3600b47839a945de9733cf17581458e0
2015-10-15 08:34:29,144 DEBUG org.apache.hadoop.hbase.regionserver.HRegionFileSystem: Committing
store file hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/.tmp/3600b47839a945de9733cf17581458e0
as hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/L/3600b47839a945de9733cf17581458e0
2015-10-15 08:34:29,150 INFO org.apache.hadoop.hbase.regionserver.HStore: Added hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/L/3600b47839a945de9733cf17581458e0,
entries=190131, sequenceid=70839856845, filesize=3.2 M
2015-10-15 08:34:29,151 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore
flush of ~44.89 MB/47066832, currentsize=22.39 KB/22928 for region Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
in 640ms, sequenceid=70839856845, compaction requested=false
2015-10-15 08:34:29,152 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled
for region Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
2015-10-15 08:34:29,152 INFO org.apache.hadoop.hbase.regionserver.HRegion: Started memstore
flush for Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.,
current region memstore size 22.39 KB, and 1/1 column families' memstores are being flushed.
2015-10-15 08:34:29,152 WARN org.apache.hadoop.hbase.regionserver.wal.FSHLog: Couldn't find
oldest seqNum for the region we are about to flush: [dad6c71ed395df19220ef1056a110086]
2015-10-15 08:34:29,225 INFO org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher: Flushed,
sequenceid=70839856876, memsize=22.4 K, hasBloomFilter=false, into tmp file hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/.tmp/a8a74549c7a44174b105cbed23cfaea1
2015-10-15 08:34:29,231 DEBUG org.apache.hadoop.hbase.regionserver.HRegionFileSystem: Committing
store file hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/.tmp/a8a74549c7a44174b105cbed23cfaea1
as hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/L/a8a74549c7a44174b105cbed23cfaea1
2015-10-15 08:34:29,279 INFO org.apache.hadoop.hbase.regionserver.HStore: Added hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/L/a8a74549c7a44174b105cbed23cfaea1,
entries=128, sequenceid=70839856876, filesize=2.6 K
2015-10-15 08:34:29,280 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore
flush of ~22.39 KB/22928, currentsize=0 B/0 for region Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
in 128ms, sequenceid=70839856876, compaction requested=true


2015-10-15 08:37:02,945 INFO org.apache.hadoop.hbase.regionserver.HStore: Closed L
2015-10-15 08:37:02,954 DEBUG org.apache.hadoop.hbase.wal.WALSplitter: Wrote region seqId=hdfs://woz/hbase/data/default/Wildfire_graph3/dad6c71ed395df19220ef1056a110086/recovered.edits/70839856879.seqid
to file, newSeqId=70839856879, maxSeqId=70838935131
2015-10-15 08:37:02,955 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
2015-10-15 08:37:02,955 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Adding moved
region record: dad6c71ed395df19220ef1056a110086 to hb17.prod1.connexity.net,60020,1444425924021
as of 70838935131
2015-10-15 08:37:02,956 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x45049a3c1b80a8b,
quorum=ccm2.prod1.connexity.net:2181,ccm1.prod1.connexity.net:2181,db2.prod1.connexity.net:2181,sql2.prod1.connexity.net:2181,ccm3.prod1.connexity.net:2181,
baseZNode=/hbase Transitioning dad6c71ed395df19220ef1056a110086 from M_ZK_REGION_CLOSING to
RS_ZK_REGION_CLOSED
2015-10-15 08:37:02,967 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x45049a3c1b80a8b,
quorum=ccm2.prod1.connexity.net:2181,ccm1.prod1.connexity.net:2181,db2.prod1.connexity.net:2181,sql2.prod1.connexity.net:2181,ccm3.prod1.connexity.net:2181,
baseZNode=/hbase Transitioned node dad6c71ed395df19220ef1056a110086 from M_ZK_REGION_CLOSING
to RS_ZK_REGION_CLOSED
2015-10-15 08:37:02,967 DEBUG org.apache.hadoop.hbase.coordination.ZkCloseRegionCoordination:
Set closed state in zk for Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
on hb18.prod1.connexity.net,60020,1444427168636
2015-10-15 08:37:02,967 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler:
Closed Wildfire_graph3,\x00)"y\x1B\xF0\x13-,1434402364692.dad6c71ed395df19220ef1056a110086.
~                                                                                        
                                                                                         
                






On 10/15/15, 11:05 AM, "Esteban Gutierrez" <esteban@cloudera.com> wrote:

>Hi Randy,
>
>As Ted suggested a couple of stack traces would be helpful but if you could
>attach the HMaster and a couple of RS logs where you see this problem that
>would be great. Also, do you have a large regions that might need some time
>to flush? have you looked into any potential problem in HDFS?
>
>thanks,
>esteban.
>
>--
>Cloudera, Inc.
>
>
>On Thu, Oct 15, 2015 at 10:39 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>
>> Can you give a bit more detail on why block eviction was cause for the
>> slow region movement?
>>
>> Did you happen to take stack traces ?
>>
>> Thanks
>>
>> > On Oct 15, 2015, at 10:32 AM, Randy Fox <rfox@connexity.com> wrote:
>> >
>> > Hi,
>> >
>> > We just upgraded from 0.94 to 1.0.0 and have noticed that region moves
>> are super slow (order of minutes) whereas previously they where in the
>> seconds range.  After looking at the code, I think the time is spent
>> waiting for the blocks to be evicted from block cache.
>> >
>> > I wanted to verify that this theory is correct and see if there is
>> anything that can be done to speed up the moves.
>> >
>> > This is particular painful as we are trying to get our configs tuned to
>> the new SW and need to do rolling restarts which is taking almost 24 hours
>> on our cluster.  We also do our own manual rebalancing of regions across
>> RS’s and that task is also now painful.
>> >
>> >
>> > Thanks,
>> >
>> > Randy
>>
Mime
View raw message