hbase-issues mailing list archives

From "Heng Chen (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-2236) Upper bound of outstanding WALs can be overrun; take 2 (take 1 was hbase-2053)
Date Tue, 07 Jul 2015 10:45:05 GMT

     [ https://issues.apache.org/jira/browse/HBASE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Heng Chen updated HBASE-2236:
    Assignee:     (was: Heng Chen)

> Upper bound of outstanding WALs can be overrun; take 2 (take 1 was hbase-2053)
> ------------------------------------------------------------------------------
>                 Key: HBASE-2236
>                 URL: https://issues.apache.org/jira/browse/HBASE-2236
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver, wal
>            Reporter: stack
>            Priority: Critical
>              Labels: moved_from_0_20_5
> So HBASE-2053 is not aggressive enough.  WALs can still exceed the upper limit on log count.  The code added by HBASE-2053 will, once it has run, ensure we let go of the oldest WAL, but to do so we might have to flush many regions.  E.g.:
> {code}
> 2010-02-15 14:20:29,351 INFO org.apache.hadoop.hbase.regionserver.HLog: Too many hlogs:
logs=45, maxlogs=32; forcing flush of 5 regions(s): test1,193717,1266095474624, test1,194375,1266108228663,
test1,195690,1266095539377, test1,196348,1266095539377, test1,197939,1266069173999
> {code}
> This takes time.  If we are taking on edits at a furious rate, we might have rolled the log again in the meantime, maybe more than once.
> Also, log rolls happen inline with a put/delete as soon as the log hits the 64MB (default) boundary, whereas the necessary flushing is done in the background by a single thread, and the memstore can overrun its (default) 64MB size.  Flushes needed to release logs will be mixed in with "natural" flushes as memstores fill.  Flushes may take longer than the writing of an HLog because they can be larger.
> So, on an RS that is struggling, the tendency would seem to be for a slight rise in WALs.  Only if the RS gets a breather will the flusher catch up.
> If HBASE-2087 happens, then the count of WALs gets a boost.
> Ideas to fix this for good would be:
> + Priority queue for queuing up flushes, with those that are queued to free up WALs having priority
> + Improve the HBASE-2053 code so that it will free more than just the last WAL, maybe even queuing flushes so we clear enough WALs to get back under the maximum WALs threshold
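
The two ideas above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the HBase implementation: `WalFlushSketch`, `FlushRequest`, `request`, `next`, and `regionsToFlush` are hypothetical names, and the map from WAL sequence number to the regions still holding unflushed edits in that WAL is assumed to be tracked elsewhere. Flushes queued to release a WAL are ordered ahead of "natural" flushes, and the helper picks the regions of as many of the oldest WALs as are needed to get back to the maximum, rather than only the single oldest WAL.

```java
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.SortedMap;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; none of these names are HBase APIs.
final class WalFlushSketch {

    /** One queued flush. forWalRelease marks flushes queued to free up a WAL. */
    static final class FlushRequest {
        final String region;
        final boolean forWalRelease;
        final long seq; // arrival order, used as a FIFO tie-breaker

        FlushRequest(String region, boolean forWalRelease, long seq) {
            this.region = region;
            this.forWalRelease = forWalRelease;
            this.seq = seq;
        }
    }

    // WAL-releasing flushes sort first (Boolean false < true), then arrival order.
    private static final Comparator<FlushRequest> ORDER =
        Comparator.comparing((FlushRequest r) -> !r.forWalRelease)
                  .thenComparingLong(r -> r.seq);

    private final PriorityBlockingQueue<FlushRequest> queue =
        new PriorityBlockingQueue<>(16, ORDER);
    private final AtomicLong counter = new AtomicLong();

    /** Queue a flush for a region. */
    public void request(String region, boolean forWalRelease) {
        queue.add(new FlushRequest(region, forWalRelease, counter.getAndIncrement()));
    }

    /** Next region the flusher thread should handle, or null if nothing is queued. */
    public String next() {
        FlushRequest r = queue.poll();
        return r == null ? null : r.region;
    }

    /**
     * Regions to flush so that enough of the oldest WALs can be released to
     * bring the WAL count back down to maxLogs.
     *
     * @param walToRegions WAL sequence number (oldest first) mapped to the
     *                     regions with unflushed edits in that WAL (assumed input)
     */
    public static Set<String> regionsToFlush(
            SortedMap<Long, List<String>> walToRegions, int maxLogs) {
        Set<String> regions = new LinkedHashSet<>(); // de-duplicates, keeps order
        int excess = walToRegions.size() - maxLogs;
        for (List<String> holders : walToRegions.values()) {
            if (excess-- <= 0) {
                break; // enough old WALs are covered
            }
            regions.addAll(holders);
        }
        return regions;
    }
}
```

With logs=45 and maxlogs=32 as in the log line above, `regionsToFlush` would cover the 13 oldest WALs in one pass, and each returned region would then be queued with `request(region, true)` so it is flushed ahead of natural flushes.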

This message was sent by Atlassian JIRA
