hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12859) Major compaction completion tracker
Date Sun, 25 Jan 2015 00:26:35 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290904#comment-14290904 ]

Lars Hofhansl commented on HBASE-12859:

Note that if you call this from a region observer you do not want to reach out to other region
servers (that is a recipe for cluster-wide deadlocks when not enough handlers are free).
So you'll only be able to find out the last compaction time for the observer's region, and
hence you still won't know whether you're done with the table in question.

There are two options:
# store metadata about work to do per region, and then remove it as each region is major compacted
# store metadata about work to do per table, then you need to be able to track whether all
regions of the table have been major compacted

For #1 one has to keep track of splits and split the "work to do" as well. #2 needs a global
view across region servers.
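
To illustrate option #1, here is a minimal in-memory sketch of per-region work tracking that also follows splits. It is not part of the attached patches. The class `RegionWorkTracker`, its method names, and the region and work-item strings are all hypothetical, and a real implementation would persist this state in a table rather than a dict. It also assumes that a major compaction of a region completes every work item recorded for that region.

```python
from dataclasses import dataclass, field


@dataclass
class RegionWorkTracker:
    """Hypothetical stand-in for a per-region "work to do" table."""
    # Maps region name -> set of outstanding work items for that region.
    pending: dict = field(default_factory=dict)

    def add_work(self, regions, item):
        # Record the work item once for every region of the table.
        for region in regions:
            self.pending.setdefault(region, set()).add(item)

    def on_split(self, parent, daughters):
        # Split the "work to do" as well: each daughter inherits
        # whatever was still outstanding for the parent.
        work = self.pending.pop(parent, set())
        if work:
            for daughter in daughters:
                self.pending.setdefault(daughter, set()).update(work)

    def on_major_compaction(self, region):
        # Assumption: a major compaction finishes all work recorded
        # for this region, so its entry can be removed.
        self.pending.pop(region, None)

    def is_done(self, item):
        # The item is done for the table once no region still lists it.
        return all(item not in work for work in self.pending.values())
```

With this shape each region server only ever touches the entries for its own regions, and a client that wants the table-wide answer calls `is_done`.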

The implementation here does not prescribe what to do.
Hence in PHOENIX-1590 we could do one of the following when dropping a view with deferred delete:
# enumerate all regions. Put a work item for each region in some table. During compaction
check that table, and after compaction remove any entries for regions that are completely done.
# just remember the metadata for the view with some "deferred delete" marker added. During
compaction check that. Then one would check with each region server when the same view is created
again (to check whether it's OK or not) and also periodically from a client with a global
view (that does not need to happen often, as it just cleans up soft-delete rows).

> Major compaction completion tracker
> -----------------------------------
>                 Key: HBASE-12859
>                 URL: https://issues.apache.org/jira/browse/HBASE-12859
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Lars Hofhansl
>         Attachments: 12859-v1.txt, 12859-v2.txt, 12859-v3.txt, 12859-wip-UNFINISHED.txt
> In various scenarios it is helpful to know a guaranteed timestamp up to which all data
in a table was major compacted.
> We can do that by keeping a major compaction timestamp in META.
> A client then can iterate all regions of a table and find a definite timestamp, which
is the oldest compaction timestamp of any of the regions.
> [~apurtell], [~ghelmling], [~giacomotaylor].
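
The client-side computation described in the issue text can be sketched as follows. This is only an illustration and does not use the HBase client API. The function name `table_compaction_timestamp`, the dict input, and the convention that 0 means "never major compacted" are assumptions made for the example.

```python
def table_compaction_timestamp(region_timestamps):
    """Return the timestamp up to which the whole table is major compacted.

    region_timestamps maps region name -> last major compaction time in
    milliseconds. Assumption: 0 means the region was never major compacted.
    """
    if not region_timestamps:
        # No regions known, so nothing can be guaranteed.
        return 0
    # The table-wide guarantee is only as good as the least recently
    # compacted region, hence the minimum.
    return min(region_timestamps.values())


# Hypothetical per-region values as a client might read them from META.
example = {
    "region-a": 1422000000000,
    "region-b": 1421900000000,
    "region-c": 1422100000000,
}
print(table_compaction_timestamp(example))  # 1421900000000
```

A single region that was never compacted pulls the result down to 0, which matches the intent that the timestamp is a guarantee for all data in the table.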

This message was sent by Atlassian JIRA
