hbase-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7667) Support stripe compaction
Date Mon, 11 Feb 2013 17:57:14 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575936#comment-13575936 ]

Sergey Shelukhin commented on HBASE-7667:
-----------------------------------------

bq. It may be a follow-on to this jira, but having "striper" dynamically add stripes at the
end of the region would allow all the stripes before the last one to "go cold", which is critical
for avoiding hugely wasteful compactions of non-changing data
Actually, it can be added as part of the main work; the HBASE-7679 (file management) code includes
such capabilities.
I wonder, though, regardless of compactions, how region management works for such a scenario. Wouldn't
all the load always be on the last region if you have TS keys?
Or, if you have artificial partitioning but query by TS, wouldn't all queries go to all servers?

bq.  To major compact a stripe, all L0 files, if any, can be split into stripes, then merge
all files belonging to the stripe.
Can you explain more about the delete marker limitation?
Suppose that, in the current compaction selection, I choose a set of files starting at the oldest file
but not including all files.
Wouldn't that be enough to process the delete markers that delete the updates within those files?
Granted, I might not process all delete markers, but I don't have to see all files. E.g. if
I only have 3 files with one entry for K each, "K=V", "delete K", "K=V2", and I compact the
first two, I can remove the entries for K from them, right?
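
To make the three-file example concrete, here is a toy sketch in Python. It is not HBase code: the entry layout, `compact_oldest`, and `read` are hypothetical names, and it assumes that file order matches write order (one increasing sequence number per entry) and that the selection always starts at the oldest file, so no older file can hold data the marker still needs to mask.

```python
from typing import List, Optional, Tuple

# Hypothetical toy entry: (key, seq, kind, value); kind is "put" or "delete".
Entry = Tuple[str, int, str, Optional[str]]


def compact_oldest(files: List[List[Entry]], count: int) -> List[List[Entry]]:
    """Compact the `count` oldest files into one. `files` is ordered oldest first."""
    selected, rest = files[:count], files[count:]

    # Highest delete seq per key among the selected files.
    newest_delete = {}
    for f in selected:
        for key, seq, kind, _ in f:
            if kind == "delete":
                newest_delete[key] = max(seq, newest_delete.get(key, -1))

    merged = []
    for f in selected:
        for entry in f:
            key, seq, kind, _ = entry
            if kind == "delete":
                # Assumption: the selection starts at the oldest file, so
                # nothing older is left for this marker to mask.
                continue
            if seq < newest_delete.get(key, -1):
                continue  # put masked by a newer delete in the selection
            merged.append(entry)

    merged.sort(key=lambda e: (e[0], -e[1]))  # by key, newest first
    return [merged] + rest


def read(files: List[List[Entry]], key: str) -> Optional[str]:
    """Return the value of the newest entry for `key`, or None if deleted/absent."""
    newest = max((e for f in files for e in f if e[0] == key),
                 key=lambda e: e[1], default=None)
    if newest is None or newest[2] == "delete":
        return None
    return newest[3]


files = [[("K", 1, "put", "V")],
         [("K", 2, "delete", None)],
         [("K", 3, "put", "V2")]]
after = compact_oldest(files, 2)
```

In this model the read result for K is the same before and after compacting the first two files. If a newer file could carry an entry that the marker is supposed to mask, the marker would have to be kept, which may be the limitation being asked about.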

bq. 1. Fixed configs : in the same way that we got a lot of stability by limiting the regions/server
to a fixed number, we might want to similarly limit the number of stripes per region to 10
(or X) instead of "every Y bytes". This will help us understand the benefit we get from striping
and it's easy to double the striping and chart the difference.
That is the original idea.

Thanks for other comments :)
                
> Support stripe compaction
> -------------------------
>
>                 Key: HBASE-7667
>                 URL: https://issues.apache.org/jira/browse/HBASE-7667
>             Project: HBase
>          Issue Type: New Feature
>          Components: Compaction
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>
> So I was thinking about having many regions as the way to make compactions more manageable,
and writing the level db doc about how level db range overlap and data mixing breaks seqNum
sorting, and discussing it with Jimmy, Matteo and Ted, and thinking about how to avoid Level
DB I/O multiplication factor.
> And I suggest the following idea, let's call it stripe compactions. It's a mix between
level db ideas and having many small regions.
> It allows us to have a subset of benefits of many regions (wrt reads and compactions)
without many of the drawbacks (managing and current memstore/etc. limitation).
> It also doesn't break seqNum-based file sorting for any one key.
> It works like this.
> The region key space is separated into configurable number of fixed-boundary stripes
(determined the first time we stripe the data, see below).
> All the data from memstores is written to normal files with all keys present (not striped),
similar to L0 in LevelDb, or current files.
> Compaction policy does 3 types of compactions.
> First is L0 compaction, which takes all L0 files and breaks them down by stripe. It may
be optimized by adding more small files from different stripes, but the main logical outcome
is that there are no more L0 files and all data is striped.
> Second is just like the current compaction, but compacts one single stripe. In
future, nothing prevents us from applying compaction rules and compacting part of the stripe
(e.g. similar to the current policy with ratios, tiers, whatever), but for the first
cut I'd argue let it "major compact" the entire stripe. Or just have the ratio and no more
complexity.
> Finally, the third addresses the concern of the fixed boundaries causing stripes to be
very unbalanced.
> It's exactly like the 2nd, except it takes 2+ adjacent stripes and writes the results
out with different boundaries.
> There's a tradeoff here - if we always take 2 adjacent stripes, compactions will be smaller
but rebalancing will take a ridiculous amount of I/O.
> If we take many stripes we are essentially getting into the epic-major-compaction problem
again. Some heuristics will have to be in place.
> In general, if, before stripes are determined, we initially let L0 grow before determining
the stripes, we will get better boundaries.
> Also, unless unbalancing is really large we don't need to rebalance really.
> Obviously this scheme (as well as level) is not applicable for all scenarios, e.g. if
timestamp is your key it completely falls apart.
> The end result:
> - many small compactions that can be spread out in time.
> - reads still read from a small number of files (one stripe + L0).
> - region splits become marvelously simple (if we could move files between regions, no
references would be needed).
> Main advantage over Level (for HBase) is that default store can still open the files
and get correct results - there are no range overlap shenanigans.
> It also needs no metadata, although we may record some for convenience.
> It also would appear to not cause as much I/O.
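
The L0 compaction described in the issue above (take all L0 files and break them down by fixed stripe boundaries) can be sketched roughly as follows. This is an illustration only, not the HBASE-7667 implementation: `stripe_index`, `l0_compact`, the `(key, seq, value)` entry layout, and the example boundaries are all hypothetical.

```python
import bisect
from typing import List, Tuple

# Hypothetical toy entry: (row key, sequence number, value).
Entry = Tuple[str, int, str]


def stripe_index(boundaries: List[str], key: str) -> int:
    """Index of the stripe that owns `key`.

    `boundaries` is the sorted list of stripe start keys, excluding the
    region start, so N boundaries define N + 1 stripes.
    """
    return bisect.bisect_right(boundaries, key)


def l0_compact(l0_files: List[List[Entry]],
               boundaries: List[str]) -> List[List[Entry]]:
    """Rewrite all L0 files into one output per stripe, leaving no L0 files."""
    stripes: List[List[Entry]] = [[] for _ in range(len(boundaries) + 1)]
    for f in l0_files:
        for entry in f:
            stripes[stripe_index(boundaries, entry[0])].append(entry)
    for s in stripes:
        # Within a stripe, order by key and then newest seq first, so
        # seqNum-based ordering for any one key is preserved.
        s.sort(key=lambda e: (e[0], -e[1]))
    return stripes


l0 = [[("a", 1, "x"), ("m", 2, "y")],
      [("a", 3, "z"), ("q", 4, "w")]]
stripes = l0_compact(l0, ["g", "p"])
```

With the boundaries "g" and "p", keys below "g" land in the first stripe, keys from "g" up to but not including "p" in the second, and the rest in the third, so a read for a single key only needs that one stripe plus any L0 files written since.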

