hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jean-Daniel Cryans (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-4579) CST.requestCompaction semantics changed, logs are now spammed when too many store files
Date Thu, 13 Oct 2011 22:44:12 GMT

     [ https://issues.apache.org/jira/browse/HBASE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Jean-Daniel Cryans updated HBASE-4579:
--------------------------------------

    Summary: CST.requestCompaction semantics changed, logs are now spammed when too many store
files  (was: Compactions that can't run are requested forever)

I spent some time debugging this and the debug I added really helped:

{quote}
2011-10-13 22:04:16,633 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Large
Compaction requested: regionName=TestTable,0028803615,1318381132063.1724203bae0dda87cddfaac9700b9dc6.,
storeName=info, fileCount=4, fileSize=219.6m (54.8m, 55.0m, 55.0m, 54.8m), priority=-1, time=6062002855079899;
Because: regionserver62023.cacheFlusher; compaction_queue=(1:0), split_queue=0
2011-10-13 22:04:21,999 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested
on TestTable,0028803615,1318381132063.1724203bae0dda87cddfaac9700b9dc6.
2011-10-13 22:04:21,999 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region
TestTable,0028803615,1318381132063.1724203bae0dda87cddfaac9700b9dc6. has too many store files;
delaying flush up to 90000ms
2011-10-13 22:04:21,999 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files
to compact 1724203bae0dda87cddfaac9700b9dc6
2011-10-13 22:04:22,000 ERROR org.apache.hadoop.hbase.regionserver.Store:
java.lang.Exception
        at org.apache.hadoop.hbase.regionserver.Store.compactSelection(Store.java:1007)
        at org.apache.hadoop.hbase.regionserver.Store.requestCompaction(Store.java:908)
        at org.apache.hadoop.hbase.regionserver.CompactSplitThread.requestCompaction(CompactSplitThread.java:204)
        at org.apache.hadoop.hbase.regionserver.CompactSplitThread.requestCompaction(CompactSplitThread.java:177)
        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:360)
        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:239)
...
2011-10-13 22:05:52,584 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Waited
90585ms on a compaction to clean up 'too many store files'; waited long enough... proceeding
with flush of TestTable,0028803615,1318381132063.1724203bae0dda87cddfaac9700b9dc6.
{quote}

The exception is there just to show me where the call was coming from.

So what we see is that compaction is grabbing all the store files and during that time the
MemStoreFlusher is trying to flush but hits the "too many store files" issue. It used to be
that requesting a compaction was just putting something on a queue silently, but now it's
actually going all the way down into Store and that's why it's now spammy.

Fixes I think are required:

 - For debugging purposes, show the encoded name when printing the "info: no store files to
compact" message because it's pretty much useless if you don't know what it belongs to.
 - Require the compaction only once, the code already knows when it's the first time you spin
so no need to ask it again and again.
 - Set a higher priority for the compaction
                
> CST.requestCompaction semantics changed, logs are now spammed when too many store files
> ---------------------------------------------------------------------------------------
>
>                 Key: HBASE-4579
>                 URL: https://issues.apache.org/jira/browse/HBASE-4579
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: Jean-Daniel Cryans
>            Priority: Critical
>             Fix For: 0.92.0
>
>
> Another bug I'm not so sure what's going on. I see this in my log:
> {quote}
> 2011-10-12 00:23:43,435 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store
files to compact
> 2011-10-12 00:23:44,335 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store
files to compact
> 2011-10-12 00:23:45,236 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store
files to compact
> 2011-10-12 00:23:46,136 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store
files to compact
> 2011-10-12 00:23:47,036 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store
files to compact
> 2011-10-12 00:23:47,936 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store
files to compact
> {quote}
> It spams for a while, and a little later instead I get:
> {quote}
> 2011-10-12 00:26:52,139 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction
of info.  Only 2 file(s) of size 176.4m have met compaction criteria.
> 2011-10-12 00:26:53,040 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction
of info.  Only 2 file(s) of size 176.4m have met compaction criteria.
> 2011-10-12 00:26:53,940 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction
of info.  Only 2 file(s) of size 176.4m have met compaction criteria.
> 2011-10-12 00:26:54,840 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction
of info.  Only 2 file(s) of size 176.4m have met compaction criteria.
> 2011-10-12 00:26:55,741 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction
of info.  Only 2 file(s) of size 176.4m have met compaction criteria.
> 2011-10-12 00:26:56,641 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction
of info.  Only 2 file(s) of size 176.4m have met compaction criteria.
> {quote}
> I believe I also saw something like that for flushes, but the region was closing so at
least I know why it was spamming (would be nice if it just unrequested the flush):
> {quote}
> 2011-10-12 00:26:40,693 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested
on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
> 2011-10-12 00:26:40,694 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing
memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.,
flushing=false, writesEnabled=false
> 2011-10-12 00:26:40,733 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested
on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
> 2011-10-12 00:26:40,733 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing
memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.,
flushing=false, writesEnabled=false
> 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested
on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
> 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing
memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.,
flushing=false, writesEnabled=false
> 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested
on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
> 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing
memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.,
flushing=false, writesEnabled=false
> 2011-10-12 00:26:40,921 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested
on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
> 2011-10-12 00:26:40,922 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing
memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.,
flushing=false, writesEnabled=false
> 2011-10-12 00:26:40,923 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested
on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
> 2011-10-12 00:26:40,923 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing
memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.,
flushing=false, writesEnabled=false
> 2011-10-12 00:26:40,923 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested
on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.
> 2011-10-12 00:26:40,923 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing
memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5.,
flushing=false, writesEnabled=false
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message