hbase-issues mailing list archives

From "Nicolas Spiegelberg (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3242) HLog Compactions
Date Wed, 17 Nov 2010 02:05:15 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932772#action_12932772 ]

Nicolas Spiegelberg commented on HBASE-3242:
--------------------------------------------

IRC communication below.  Highlights:

1. The ICV (incrementColumnValue) case is the highest priority for a bunch of use cases and would
be simpler to implement, so we could just address that first (a rough sketch follows the
transcript below).
2. Make sure we properly handle replication with any changes.
3. A 'compacting flush' capability would be another plus while we're digging into the compaction
code.
------------------------------------------------
[5:21pm] tlipcon: hlog compaction? 
[5:22pm] nspiegelberg:   it's an idea that's been floating around in my head the past couple
weeks.
[5:22pm] tlipcon: makes some sense
[5:22pm] nspiegelberg: the BigTable paper actually mentions that it compacts logs. that got
me started thinking about the idea
[5:22pm] tlipcon: just gonna be tricky... all these heuristics
[5:22pm] dj_ryan: being fully persistent and high speed are hard to get
[5:23pm] dj_ryan: i was thinking it might be possible to improve the speed of log replay
[5:23pm] dj_ryan: in which case we could have more outstanding logs
[5:23pm] nspiegelberg: well I think the purpose is to efficiently address edge cases
[5:23pm] tlipcon: right, log replay and splitting are both kind of slow
[5:23pm] nspiegelberg: if log entries were uniformly distributed, what we have now is perfect
[5:23pm] tlipcon: nspiegelberg: I wonder how much the "compacting flush" would buy us
[5:24pm] tlipcon: what BT calls minor compactions
[5:24pm] nspiegelberg: that's another idea that jgray advocates
[5:25pm] nspiegelberg: really, we can implement a trivial HLog compaction that is only useful
for ICV applications and it would be greatly beneficial for us
[5:33pm] apurtell: "a trivial HLog compaction that is only useful for ICV applications" --
seems a good start, we'd find that useful and so would stumble i believe
[5:35pm] nspiegelberg: we already have a practical need for HLog compaction in both cases, but
the ICV application is definitely higher priority.
[5:38pm] nspiegelberg: yeah, the only thing I haven't researched is replication impact.  I
imagine that we could handle HLog compactions independently on each cluster.  Then, flag the
compacted HLogs and just not send them to the replica cluster.
[5:39pm] dj_ryan: we might want some jd input
[5:39pm] dj_ryan: but basically there is a 'read point' in a hlog for the replication sender
[5:39pm] dj_ryan: so if you are compacting stuff that was already sent, we'll be ok
[5:39pm] dj_ryan: and at the target, they'd do similar things i guess
[5:40pm] nspiegelberg: definitely.  RFC.  I have migration woes right now, but I wanted to
get the idea out there and have it running through ppls heads 
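
To make the ICV-only idea above concrete, here is a rough, self-contained sketch.  The types
and names are assumptions for illustration, not the actual HBase HLog/WALEdit API: N increment
entries for the same cell are collapsed into one consolidated entry, and anything past the
replication sender's read point is left untouched, per the replication discussion above.

// Illustrative only: hypothetical types, not HBase's real WAL classes.
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IcvLogCompactionSketch {

  /** One increment edit as it might appear in an HLog (assumed shape). */
  static final class IncrementEdit {
    final String row, family, qualifier;
    final long delta;   // increment amount
    final long seqId;   // HLog sequence id of this edit
    IncrementEdit(String row, String family, String qualifier, long delta, long seqId) {
      this.row = row; this.family = family; this.qualifier = qualifier;
      this.delta = delta; this.seqId = seqId;
    }
    String cellKey() { return row + "/" + family + ":" + qualifier; }
  }

  /**
   * Collapses increments per cell into a single entry (summed delta, highest seqId),
   * but only for edits at or below the replication sender's read point; later edits
   * are carried over unmodified so the sender still ships them.
   */
  static List<IncrementEdit> compact(List<IncrementEdit> edits, long replicationReadPoint) {
    Map<String, IncrementEdit> merged = new LinkedHashMap<>();
    List<IncrementEdit> notYetReplicated = new ArrayList<>();
    for (IncrementEdit e : edits) {
      if (e.seqId > replicationReadPoint) {
        notYetReplicated.add(e);        // do not touch edits the sender hasn't shipped yet
        continue;
      }
      merged.merge(e.cellKey(), e, (a, b) -> new IncrementEdit(
          a.row, a.family, a.qualifier, a.delta + b.delta, Math.max(a.seqId, b.seqId)));
    }
    List<IncrementEdit> compacted = new ArrayList<>(merged.values());
    compacted.addAll(notYetReplicated);
    return compacted;
  }

  public static void main(String[] args) {
    List<IncrementEdit> hlog = List.of(
        new IncrementEdit("row1", "cf", "hits", 1, 100),
        new IncrementEdit("row1", "cf", "hits", 1, 101),
        new IncrementEdit("row2", "cf", "hits", 5, 102),
        new IncrementEdit("row1", "cf", "hits", 1, 103));
    // With a read point of 102, the first three entries collapse to two and the
    // last one is left alone: four HLog entries become three.
    for (IncrementEdit e : compact(hlog, 102)) {
      System.out.println(e.cellKey() + " += " + e.delta + " (seq " + e.seqId + ")");
    }
  }
}

The real thing would of course write the result out as a new HLog and swap it in, then flag the
compacted log so it is not shipped to the replica cluster, as suggested above.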


> HLog Compactions
> ----------------
>
>                 Key: HBASE-3242
>                 URL: https://issues.apache.org/jira/browse/HBASE-3242
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Nicolas Spiegelberg
>
> Currently, our memstore flush algorithm is pretty trivial.  We let the memstore grow to a
> flushsize and flush a region, or grow to a certain log count and then flush everything below
> a seqid.  In certain situations, we can get big wins from being more intelligent with our
> memstore flush algorithm.  I suggest we look into algorithms to intelligently handle HLog
> compactions.  By compaction, I mean replacing existing HLogs with new HLogs created using
> the contents of a memstore snapshot.  Situations where we can get huge wins:
> 1. In the incrementColumnValue case, N HLog entries often correspond to a single memstore
> entry.  Although we may have large HLog files, our memstore could be relatively small.
> 2. If we have a hot region, the majority of the HLog consists of that one region and the
> other regions' edits would be minuscule.
> In both cases, we are forced to flush a bunch of very small stores.  It's really hard for a
> compaction algorithm to be efficient when it has no guarantees of the approximate size of a
> new StoreFile, so it currently does unconditional, inefficient compactions.  Additionally,
> compactions & flushes suck because they invalidate cache entries, be it memstore or LRU
> cache.  If we can limit flushes to cases where we will have significant HFile output on a
> per-Store basis, we can get improved performance, stability, and reduced failover time.
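
As a thought experiment only (the names below are assumptions, not the actual HRegion/Store
flush path), the per-Store flush decision argued for in the last paragraph might look roughly
like this: flush only the stores whose memstore would yield a reasonably sized HFile, and carry
the small ones into a rewritten, compacted HLog instead of forcing tiny StoreFiles.

// Hypothetical sketch; these are not real HBase classes or configuration keys.
import java.util.LinkedHashMap;
import java.util.Map;

public class SelectiveFlushSketch {

  /** Assumed threshold: smallest HFile worth writing out on its own. */
  static final long MIN_HFILE_BYTES = 16L * 1024 * 1024;

  /** Per-Store decision when the region is asked to free up HLogs. */
  enum Action { FLUSH_TO_HFILE, REWRITE_INTO_COMPACTED_HLOG }

  /** Input: store name -> current memstore size in bytes (illustrative). */
  static Map<String, Action> plan(Map<String, Long> memstoreSizes) {
    Map<String, Action> decisions = new LinkedHashMap<>();
    for (Map.Entry<String, Long> e : memstoreSizes.entrySet()) {
      decisions.put(e.getKey(), e.getValue() >= MIN_HFILE_BYTES
          ? Action.FLUSH_TO_HFILE                 // big enough for a worthwhile StoreFile
          : Action.REWRITE_INTO_COMPACTED_HLOG);  // tiny: keep it in a rewritten log instead
    }
    return decisions;
  }

  public static void main(String[] args) {
    Map<String, Long> sizes = new LinkedHashMap<>();
    sizes.put("hotStore", 64L * 1024 * 1024);  // hot store dominates the HLog
    sizes.put("coldStoreA", 200L * 1024);      // would otherwise become a tiny StoreFile
    sizes.put("coldStoreB", 50L * 1024);
    plan(sizes).forEach((store, action) -> System.out.println(store + " -> " + action));
  }
}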

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

