hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ramkrishna.s.vasudevan (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5312) Closed parent region present in Hlog.lastSeqWritten
Date Fri, 10 Feb 2012 17:53:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205580#comment-13205580
] 

ramkrishna.s.vasudevan commented on HBASE-5312:
-----------------------------------------------

+1 on adding logs but need to ensure we don't slow down the hlog append and flush flow.
By the way Jimmy, you have any suspects in this issue?
                
> Closed parent region present in Hlog.lastSeqWritten
> ---------------------------------------------------
>
>                 Key: HBASE-5312
>                 URL: https://issues.apache.org/jira/browse/HBASE-5312
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.5
>            Reporter: ramkrishna.s.vasudevan
>             Fix For: 0.90.7
>
>
> This is in reference to the mail sent in the dev mailing list
> "Closed parent region present in Hlog.lastSeqWritten".
> The sceanrio described is
> We had a region that was split into two daughters.  When the hlog roll tried to flush
the region there was an entry in the HLog.lastSeqWritten that was not flushed or removed from
the lastSeqWritten during the parent close.
> Because this flush was not happening subsequent flushes were getting blocked
> {code}
>  05:06:44,422 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many
>  hlogs: logs=122, maxlogs=32; forcing flush of 1 regions(s):
>  2acaf8e3acfd2e8a5825a1f6f0aca4a8
>  05:06:44,422 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed
>  to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null,
> requester=null
>  05:10:48,666 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many
>  hlogs: logs=123, maxlogs=32; forcing flush of 1 regions(s):
>  2acaf8e3acfd2e8a5825a1f6f0aca4a8
>  05:10:48,666 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed
>  to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null,
> requester=null
>  05:14:46,075 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many
>  hlogs: logs=124, maxlogs=32; forcing flush of 1 regions(s):
>  2acaf8e3acfd2e8a5825a1f6f0aca4a8
>  05:14:46,075 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed
>  to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null,
> requester=null
>  05:15:41,584 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many
>  hlogs: logs=125, maxlogs=32; forcing flush of 1 regions(s):
>  2acaf8e3acfd2e8a5825a1f6f0aca4a8
>  05:15:41,584 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed
>  to schedule flush of 2acaf8e3acfd2e8a5825a1f6f0aca4a8r=null,
> {code}
> Lets see what happened for the region 2acaf8e3acfd2e8a5825a1f6f0aca4a8
> {code}
> 2012-01-06 00:30:55,214 INFO org.apache.hadoop.hbase.regionserver.Store: Renaming flushed
file at hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/2acaf8e3acfd2e8a5825a1f6f0aca4a8/.tmp/1755862026714756815
to hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/2acaf8e3acfd2e8a5825a1f6f0aca4a8/value/973789709483406123
> 2012-01-06 00:30:58,946 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated
Htable_UFDR_016,049790700093168-04565200000,1325809837958.0ebe5bd7fcbc09ee074d5600b9d4e062.
> 2012-01-06 00:30:59,614 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/2acaf8e3acfd2e8a5825a1f6f0aca4a8/value/973789709483406123,
entries=7537, sequenceid=20312223, memsize=4.2m, filesize=2.9m
> 2012-01-06 00:30:59,787 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished
snapshotting, commencing flushing stores
> 2012-01-06 00:30:59,787 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore
flush of ~133.5m for region Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.
in 21816ms, sequenceid=20312223, compaction requested=true
> 2012-01-06 00:30:59,787 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread:
Compaction requested for Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.
because regionserver20020.cacheFlusher; priority=0, compaction queue size=5840
> {code}
> A user triggered split has been issued to this region which can be seen in the above
logs.
> The flushing of this region has resulted in a seq id 20312223.
> The region has been splitted and the parent region has been closed
> {code}
> 00:31:12,607 INFO org.apache.hadoop.hbase.regionserver.SplitTransaction: Starting split
of region Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.
> 00:31:13,694 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.:
disabling compactions & flushes
> 00:31:13,694 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for
region Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.
> 00:31:13,718 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.
> 00:31:39,552 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Offlined parent region
Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8. in META
> 00:31:41,374 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/259d0c620c9105928e52713f4a5a252e/value/239119195230239381.2acaf8e3acfd2e8a5825a1f6f0aca4a8,
isReference=true, isBulkLoadResult=false, seqid=20201181, majorCompaction=false
> 2012-01-06 00:31:42,529 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/52ff3c7c6df6e0337876bbca29cee84a/value/973789709483406123.2acaf8e3acfd2e8a5825a1f6f0aca4a8,
isReference=true, isBulkLoadResult=false, seqid=20312224, majorCompaction=false
> 2012-01-06 00:31:42,532 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://192.168.1.103:9000/hbase/Htable_UFDR_031/259d0c620c9105928e52713f4a5a252e/value/973789709483406123.2acaf8e3acfd2e8a5825a1f6f0aca4a8,
isReference=true, isBulkLoadResult=false, seqid=20312223, majorCompaction=false
> INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: Region split, META updated,
and report to master. Parent=Htable_UFDR_031,00332,1325808823997.2acaf8e3acfd2e8a5825a1f6f0aca4a8.,
new regions: Htable_UFDR_031,00332,1325809872607.259d0c620c9105928e52713f4a5a252e., Htable_UFDR_031,003732800093168-03594291912,1325809872607.52ff3c7c6df6e0337876bbca29cee84a..
Split took 29sec
> {code}
> In the above logs we can also see that the new daugher regions have the next seq id as
20312223 and 20312224.
> {code}
> DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Found 1 hlogs to remove out of total
4; oldest outstanding sequenceid is 20312224 from region 2acaf8e3acfd2e8a5825a1f6f0aca4a8
> {code}
> Now we see that the parent region which was clsoed has a seq id 20312224 which is not
flushed.
> So further flush are failing as the region is already removed from onlineRegionList.
> The doubt here is before the region could be closed, a put has arrived for this region.
 But due to some reason the flush has  not happened for that. We tried to dig this, but not
able to get this problem again.
> 0.90.5 version + few 0.90.6 patches ( before 0.90.6RC0)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message