hbase-issues mailing list archives

From "Jonathan Gray (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3656) Merging flush; merge a flush with one of the existing store files (the smallest?) so we skip creating a new store file on each flush
Date Wed, 16 Mar 2011 20:44:29 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007686#comment-13007686 ]

Jonathan Gray commented on HBASE-3656:
--------------------------------------

I think we'd need some pretty well-thought-out heuristics for when to do it and when not to
do it.

I also had a stab at this when I worked on HBASE-2375 way back when (yeah, need to get back
on that one soon).

When I looked at this back then, a much bigger win was to upgrade a "flush then compaction"
to a "merge flush with files that would have been selected for compaction".  (We only compact
after a flush today, but before we do the flush, we can actually know whether a compaction
will be requested afterwards.)  In this case, I think it would be a clear win.

Otherwise, I think it would mostly make sense if the file it is merging with was relatively
small and the amount being flushed is relatively small (the kind of stuff that happens when
you have too many regions on each RS or not enough total HLog capacity --> small flushes).
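
To make the two cases above concrete, here is a rough sketch of what such a heuristic
could look like.  Everything in it is an assumption for illustration only: the class,
the enum, the method name and all three thresholds are hypothetical and are not existing
HBase code or configuration.  "Would a compaction be requested after this flush" is
approximated by a simple file-count check, which is a stand-in for the real compaction
selection.  The "skip it on close" branch comes from the issue description.

```java
// Hypothetical sketch of a merging-flush decision; not HBase code.
final class MergingFlushPolicy {

    enum FlushAction {
        PLAIN_FLUSH,                       // write a new store file as today
        MERGE_WITH_COMPACTION_CANDIDATES,  // fold the flush into the compaction that would follow
        MERGE_WITH_SMALLEST_FILE           // merge the flush into the smallest existing file
    }

    // All thresholds below are made-up values for illustration.
    static final int COMPACTION_FILE_THRESHOLD = 3;          // file count that would trigger a compaction
    static final long SMALL_FLUSH_BYTES = 8L * 1024 * 1024;  // "relatively small" flush
    static final long SMALL_FILE_BYTES = 16L * 1024 * 1024;  // "relatively small" store file

    static FlushAction decide(long flushBytes, long[] storeFileBytes, boolean closing) {
        // On region close we want the flush to finish quickly, so never merge.
        if (closing) {
            return FlushAction.PLAIN_FLUSH;
        }
        // A plain flush would add one file; if that would trigger a compaction
        // anyway, do the merge as part of the flush instead.
        if (storeFileBytes.length + 1 >= COMPACTION_FILE_THRESHOLD) {
            return FlushAction.MERGE_WITH_COMPACTION_CANDIDATES;
        }
        // Nothing to merge with.
        if (storeFileBytes.length == 0) {
            return FlushAction.PLAIN_FLUSH;
        }
        long smallest = Long.MAX_VALUE;
        for (long size : storeFileBytes) {
            smallest = Math.min(smallest, size);
        }
        // Otherwise merge only when both the flush and the target file are small.
        if (flushBytes <= SMALL_FLUSH_BYTES && smallest <= SMALL_FILE_BYTES) {
            return FlushAction.MERGE_WITH_SMALLEST_FILE;
        }
        return FlushAction.PLAIN_FLUSH;
    }
}
```

With these made-up thresholds, a 1 MB flush into a store holding a single 4 MB file would
be merged, while the same flush into a store holding a single 1 GB file would still write
a new file.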

> Merging flush; merge a flush with one of the existing store files (the smallest?) so
we skip creating a new store file on each flush
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-3656
>                 URL: https://issues.apache.org/jira/browse/HBASE-3656
>             Project: HBase
>          Issue Type: Task
>            Reporter: stack
>
> This behavior is described in the BT paper.  Years ago I had a go at it but at the time
it slowed flushing significantly -- and IIRC we had no barriers on writes when the memory
pressure was high -- so it brought on OOMEs... so punted on it.  It's time to consider this
feature again.
> Would we always do it?  Maybe not if it's a close?  If a close we want stuff to run quickly
so we should skip the merge.  But any other time, we should do it?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
