hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk
Date Wed, 10 Dec 2014 06:54:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14240737#comment-14240737 ]

stack commented on HBASE-10201:

[~jeffreyz] When you say...

bq. ... This issue may only happen in 0.98 though.

Is that because we are not doing DLR (distributed log replay) in 0.98, or for some other
reason?  This patch is unlikely to be backported to 0.98, I'd say.

On the fix for 1.) above: hfiles will be written out with the store's flushed seqid, but we
will keep on telling the master the oldest unflushed edit (oldestUnflushedSeqId).  Since
flush policies can return any set of Stores without regard to sequenceid, we could have edits
in memstores with sequenceids that are earlier than those of persisted hfiles.  Telling
the master oldestUnflushedSeqId does not guarantee that oldestUnflushedSeqId will be available
at recovery time (it is in the master's memory only IIRC, and the master may crash and lose it),
so when the region opens post-recovery, we look at sequenceids from hfiles to figure the region's
sequenceid.  Will this mean we drop edits because the region thinks its sequenceid is higher than
it should be?
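
To make the worry concrete, here is a toy model of the scenario, not HBase code. Every name in it (SeqIdSketch, Edit, replayRegionLevel, replayPerStore) is made up for illustration. It assumes replay skips any WAL edit whose sequenceid is not greater than a single region-level value taken as the max over hfiles; whether recovery actually behaves that way is exactly the open question above. Store A holds unflushed edits 1 and 3, while store B has flushed through sequenceid 4.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these names are HBase APIs.
final class SeqIdSketch {
    static final class Edit {
        final String store;
        final long seqId;
        Edit(String store, long seqId) { this.store = store; this.seqId = seqId; }
    }

    // ASSUMPTION: the region's sequenceid on open is the max seqid seen in any hfile.
    static long regionSeqIdFromHFiles(Map<String, Long> flushedSeqIdByStore) {
        return flushedSeqIdByStore.isEmpty() ? -1L : Collections.max(flushedSeqIdByStore.values());
    }

    // ASSUMPTION: replay keeps only edits newer than the single region-level seqid.
    static List<Edit> replayRegionLevel(List<Edit> wal, long regionSeqId) {
        List<Edit> kept = new ArrayList<>();
        for (Edit e : wal) {
            if (e.seqId > regionSeqId) kept.add(e);
        }
        return kept;
    }

    // Alternative: compare each edit against the flushed seqid of its own store.
    static List<Edit> replayPerStore(List<Edit> wal, Map<String, Long> flushedSeqIdByStore) {
        List<Edit> kept = new ArrayList<>();
        for (Edit e : wal) {
            long flushed = flushedSeqIdByStore.getOrDefault(e.store, -1L);
            if (e.seqId > flushed) kept.add(e);
        }
        return kept;
    }

    // WAL contents: edits 1 and 3 belong to store A, edits 2 and 4 to store B.
    static List<Edit> exampleWal() {
        return Arrays.asList(new Edit("A", 1), new Edit("B", 2), new Edit("A", 3), new Edit("B", 4));
    }

    // Only store B was selected by the flush policy; its hfile carries seqid 4.
    static Map<String, Long> exampleFlushed() {
        Map<String, Long> flushed = new HashMap<>();
        flushed.put("B", 4L);
        return flushed;
    }
}
```

In this model the region-level filter replays nothing, so A's edits 1 and 3 are lost, while the per-store filter replays both of them.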

3. is a 'known' cost.  Good to know that DLR won't have this issue.

4. is a good point (as is 2.)

> Port 'Make flush decisions per column family' to trunk
> ------------------------------------------------------
>                 Key: HBASE-10201
>                 URL: https://issues.apache.org/jira/browse/HBASE-10201
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>            Reporter: Ted Yu
>            Assignee: zhangduo
>             Fix For: 1.0.0, 2.0.0
>         Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, HBASE-10201-0.98_1.patch,
HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, HBASE-10201.patch, HBASE-10201_1.patch,
HBASE-10201_10.patch, HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, HBASE-10201_13.patch,
HBASE-10201_14.patch, HBASE-10201_15.patch, HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_2.patch,
HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, HBASE-10201_6.patch, HBASE-10201_7.patch,
HBASE-10201_8.patch, HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png
> Currently the flush decision is made using the aggregate size of all column families.
When large and small column families co-exist, this causes many small flushes of the smaller
CF. We need to make per-CF flush decisions.
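
As a rough illustration of the difference described in the issue summary, the sketch below contrasts an aggregate flush decision with a per-CF one. It is not the patch's code: the class, the method names, the thresholds, and the "flush everything if no family is large enough" fallback are all hypothetical choices made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and thresholds are illustrative, not HBase's.
final class FlushDecisionSketch {
    static long total(Map<String, Long> memstoreSizeByFamily) {
        long sum = 0;
        for (long size : memstoreSizeByFamily.values()) sum += size;
        return sum;
    }

    // Aggregate decision: once the region total crosses the threshold, flush every family.
    static List<String> aggregatePolicy(Map<String, Long> sizes, long regionThreshold) {
        return total(sizes) >= regionThreshold
            ? new ArrayList<>(sizes.keySet())
            : new ArrayList<String>();
    }

    // Per-family decision: same trigger, but only families above a lower bound are flushed.
    static List<String> perFamilyPolicy(Map<String, Long> sizes, long regionThreshold,
                                        long familyLowerBound) {
        List<String> selected = new ArrayList<>();
        if (total(sizes) < regionThreshold) return selected;
        for (Map.Entry<String, Long> e : sizes.entrySet()) {
            if (e.getValue() >= familyLowerBound) selected.add(e.getKey());
        }
        // ASSUMPTION: if no family is large enough, fall back to flushing all of them.
        if (selected.isEmpty()) selected.addAll(sizes.keySet());
        return selected;
    }

    // One large and one small family, sizes in MB.
    static Map<String, Long> exampleSizes() {
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("big", 120L);
        sizes.put("small", 2L);
        return sizes;
    }
}
```

With a region threshold of 100 and a per-family lower bound of 16, the aggregate policy selects both families (producing a tiny file for "small"), while the per-family policy selects only "big".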

This message was sent by Atlassian JIRA
