hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-1212) merge tool expects regions all have different sequence ids
Date Mon, 11 May 2009 17:17:45 GMT

     [ https://issues.apache.org/jira/browse/HBASE-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1212:

    Fix Version/s:     (was: 0.20.0)

Thinking on it, this event should be extremely rare.  Sequence ids are monotonically increasing
in a running regionserver.  Across a cluster, two files of the same family would have to end
with the same sequenceid.  Then what's the likelihood that, of all the regions on the cluster,
these are the two to merge?  (Merge is a little-used tool to date.)

To fix, we would need to look at the content of the two files and make a judgement as to which
should come before the other -- which has the most recent edits.  Maybe we could do something
basic like let the file with the largest size prevail over the smaller.  Once we've figured
which file to bring to the fore, we need to rewrite the hfile so we can change the sequence
id.  Since we're rewriting at least one of the files, we might as well compact them.
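
A rough sketch of the "largest prevails" idea, for illustration only.  FileInfo and
SizeTieBreak are made-up names, not HBase classes, and file size is only a guess at which
file holds the most recent edits.  Files are ordered oldest first by sequence id; when the
ids collide, the smaller file sorts first so the larger one is treated as most recent.

```java
import java.util.Comparator;

// Hypothetical stand-in for a store file's metadata; not an HBase class.
final class FileInfo {
    final String name;
    final long sequenceId;
    final long sizeBytes;

    FileInfo(String name, long sequenceId, long sizeBytes) {
        this.name = name;
        this.sequenceId = sequenceId;
        this.sizeBytes = sizeBytes;
    }
}

final class SizeTieBreak {
    // Oldest first: by sequence id, then by size, so on an id collision
    // the larger file is ordered last (that is, treated as most recent).
    static final Comparator<FileInfo> OLDEST_FIRST =
        Comparator.comparingLong((FileInfo f) -> f.sequenceId)
                  .thenComparingLong(f -> f.sizeBytes);

    // Returns the file that should be brought to the fore.
    // If both id and size are equal, the tie is unresolved and 'a' is returned.
    static FileInfo prevailing(FileInfo a, FileInfo b) {
        return OLDEST_FIRST.compare(a, b) >= 0 ? a : b;
    }
}
```

Note the sketch only picks a winner.  The rewrite of the hfile with a new sequence id (or
the compaction) would still have to happen afterwards.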

We could move to modification times.  That should simplify this sequenceid story, but it
wouldn't remove this issue.  We'd still have to figure which store file to favor if two
happened to have the same mod time.

In Bigtable, Chubby owns the storefiles/sstables.  Maybe that's where we should go so we don't
have sequenceids anymore?

Moving out of 0.20.0 because this issue is rare and the amount of work to address it is large.

> merge tool expects regions all have different sequence ids
> ----------------------------------------------------------
>                 Key: HBASE-1212
>                 URL: https://issues.apache.org/jira/browse/HBASE-1212
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
> Currently, when merging two regions, the merge tool compares their sequence ids.  If they
> are the same, it decrements one.  It needs to do this because on region open, files are
> keyed by their sequenceid; if two are the same, one will erase the other.
> Well, with the move to the aggregating hfile format, the sequenceid is written when the
> file is created, and it's no longer written into an aside file but as metadata at the end
> of the file.  Changing the sequenceid is no longer an option.
> This issue is about figuring out a solution for the rare case where two store files have
> the same sequence id AND we want to merge the two regions.
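
To make the "one will erase the other" point concrete, here is a minimal sketch, assuming
files are indexed in a sorted map keyed by sequence id as the description says.
StoreFileStub and SequenceIdKeying are hypothetical names, not the actual HBase code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

// Hypothetical stand-in for a store file; not an HBase class.
final class StoreFileStub {
    final String name;
    final long sequenceId;

    StoreFileStub(String name, long sequenceId) {
        this.name = name;
        this.sequenceId = sequenceId;
    }
}

final class SequenceIdKeying {
    // Index files by sequence id. A later file with the same id
    // silently replaces the earlier one in the map.
    static TreeMap<Long, StoreFileStub> indexBySequenceId(List<StoreFileStub> files) {
        TreeMap<Long, StoreFileStub> byId = new TreeMap<>();
        for (StoreFileStub f : files) {
            byId.put(f.sequenceId, f);
        }
        return byId;
    }

    public static void main(String[] args) {
        // Two files from different regions that happen to share a sequence id.
        List<StoreFileStub> files = Arrays.asList(
            new StoreFileStub("regionA/file", 42L),
            new StoreFileStub("regionB/file", 42L));
        TreeMap<Long, StoreFileStub> byId = indexBySequenceId(files);
        System.out.println(files.size() + " files in, " + byId.size() + " indexed");
    }
}
```

This is why the merge tool used to decrement one of the ids before merging, and why that
workaround stops working once the id is baked into the file's metadata.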

