hbase-issues mailing list archives

From "Nicolas Spiegelberg (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3242) HLog Compactions
Date Fri, 10 Dec 2010 02:13:27 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970049#action_12970049
] 

Nicolas Spiegelberg commented on HBASE-3242:
--------------------------------------------

So, talking about this internally: I wasn't 100% thinking along Stack's lines.  My assumption
is that we could support per-CF flushing of the MemStore and would therefore have a different
seqno per CF to prune on.  This would prevent a slow-growing CF from being flushed until it
reaches a significant size.
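To make the per-CF idea concrete, here is a minimal sketch in plain Java (not actual HBase classes; `CfState`, `FLUSH_SIZE`, and the method names are all illustrative assumptions): each CF tracks its own memstore size and oldest unflushed seqno, only CFs that have grown large enough get flushed, and the HLog can then be pruned up to the minimum oldest-unflushed seqno of whatever is left in memory.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; these are not real HBase classes.
class PerCfFlushSketch {
    static final long FLUSH_SIZE = 64L * 1024 * 1024; // example threshold, 64 MB

    // Each column family tracks its own memstore size and oldest unflushed seqno.
    static class CfState {
        final String name;
        final long memstoreSize;
        final long oldestUnflushedSeqno;
        CfState(String name, long memstoreSize, long oldestUnflushedSeqno) {
            this.name = name;
            this.memstoreSize = memstoreSize;
            this.oldestUnflushedSeqno = oldestUnflushedSeqno;
        }
    }

    // Flush only CFs large enough to produce a meaningful HFile;
    // a slow-growing CF keeps accumulating instead of being force-flushed.
    static List<String> cfsToFlush(List<CfState> cfs) {
        List<String> out = new ArrayList<>();
        for (CfState cf : cfs) {
            if (cf.memstoreSize >= FLUSH_SIZE) {
                out.add(cf.name);
            }
        }
        return out;
    }

    // The HLog can be pruned up to the minimum oldest-unflushed seqno
    // across all CFs that were NOT flushed (their edits must stay replayable).
    static long pruneUpToSeqno(List<CfState> cfs, List<String> flushed) {
        long min = Long.MAX_VALUE;
        for (CfState cf : cfs) {
            if (!flushed.contains(cf.name)) {
                min = Math.min(min, cf.oldestUnflushedSeqno);
            }
        }
        return min;
    }
}
```

The point of the sketch is the second method: with per-CF seqnos, a flush of the hot CF immediately advances the prune point, instead of being held back by a tiny CF that was never worth flushing.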

Another thing to keep in mind: 

HFile compaction = read HFiles + merge + write new HFiles
HLog compaction = snapshot MemStore + prune + write aggregate HLog.

So HLog compaction only adds write IO, not read IO.  All said, Karthik's HBASE-3327 suggestion
would be much easier to implement in the short term, since HLog compactions would require
merging the snapshot MemStore + the current MemStore after compaction has finished.
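A rough sketch of why the aggregate log written from a MemStore snapshot can be much smaller than the HLogs it replaces, using the incrementColumnValue case (illustrative code, not HBase internals; the cell-key string format is an assumption): replaying N increments of the same cell leaves a single memstore entry, so the snapshot written back out has one record where the old logs had N.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only, not HBase internals.
class HLogCompactionSketch {
    // Model an HLog as a sequence of (cellKey, increment) entries.
    // Applying them to a memstore collapses repeated increments of the
    // same cell into one entry, exactly as incrementColumnValue does.
    static Map<String, Long> snapshotFromLog(List<Map.Entry<String, Long>> logEntries) {
        Map<String, Long> memstore = new HashMap<>();
        for (Map.Entry<String, Long> e : logEntries) {
            // N log records for one cell -> one memstore entry holding the sum.
            memstore.merge(e.getKey(), e.getValue(), Long::sum);
        }
        // Writing this map out as the new aggregate HLog is pure write IO:
        // no existing HLog file needs to be read back.
        return memstore;
    }
}
```

With a counter-heavy workload the old log length grows with the number of increments while the snapshot size grows only with the number of distinct cells, which is the "large HLog, small memstore" situation the issue describes.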

> HLog Compactions
> ----------------
>
>                 Key: HBASE-3242
>                 URL: https://issues.apache.org/jira/browse/HBASE-3242
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Nicolas Spiegelberg
>
> Currently, our memstore flush algorithm is pretty trivial.  We let it grow to a flushsize
and flush a region or grow to a certain log count and then flush everything below a seqid.
 In certain situations, we can get big wins from being more intelligent with our memstore
flush algorithm.  I suggest we look into algorithms to intelligently handle HLog compactions.
 By compaction, I mean replacing existing HLogs with new HLogs created using the contents
of a memstore snapshot.  Situations where we can get huge wins:
> 1. In the incrementColumnValue case,  N HLog entries often correspond to a single memstore
entry.  Although we may have large HLog files, our memstore could be relatively small.
> 2. If we have a hot region, the majority of the HLog consists of that one region and
other region edits would be minuscule.
> In both cases, we are forced to flush a bunch of very small stores.  It's really hard
for a compaction algorithm to be efficient when it has no guarantees of the approximate size
of a new StoreFile, so it currently does unconditional, inefficient compactions.  Additionally,
compactions & flushes suck because they invalidate cache entries, be it the memstore or the LRU cache.
 If we can limit flushes to cases where we will have significant HFile output on a per-Store
basis, we can get improved performance, stability, and reduced failover time.
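For reference, the "trivial" policy the description starts from can be restated as a sketch (the method names and constants are illustrative, not the actual HBase configuration keys): flush a region when its memstore reaches the flush size, or force a flush of everything below a seqid when the HLog count grows too large.

```java
// Illustrative restatement of the current flush triggers described above.
class TrivialFlushPolicySketch {
    // Trigger 1: the region's memstore has grown to the flush size.
    static boolean shouldFlushRegion(long memstoreSize, long flushSize) {
        return memstoreSize >= flushSize;
    }

    // Trigger 2: too many HLogs are outstanding, so everything below a
    // seqid gets flushed regardless of how small each store is.
    static boolean shouldForceFlushBySeqid(int hlogCount, int maxLogs) {
        return hlogCount > maxLogs;
    }
}
```

The second trigger is the problematic one for the scenarios above: it fires on log count alone, with no regard for how small the resulting StoreFiles will be.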

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

