hbase-issues mailing list archives

From "Anastasia Braginsky (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17081) Flush the entire CompactingMemStore content to disk
Date Mon, 28 Nov 2016 10:56:59 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15701629#comment-15701629 ]

Anastasia Braginsky commented on HBASE-17081:
---------------------------------------------

Thank you for your insights, [~ram_krish]!

bq. What I found was that with only flushing the tail anything more than 6 

Do you mean with merges? Merging every 6 segments in the pipeline and flushing the tail (?)
It is reasonable that you got "too many store files" then. It should not happen with composite
snapshot.
On average, one flush-to-disk is needed for every 4 in-memory flushes. Thus if THRESHOLD_PIPELINE_SEGMENTS
is higher than 5, merges should be rare, unless the entire system is under stress.
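
To make the arithmetic concrete, here is a minimal sketch of the policy as I describe it above. Apart from the THRESHOLD_PIPELINE_SEGMENTS name, everything here (the class, shouldMerge(), the assumed 4:1 ratio) is invented for illustration and is not the actual HBase code:

{code:java}
// Hypothetical sketch of the merge decision discussed above; all names
// besides THRESHOLD_PIPELINE_SEGMENTS are assumptions for illustration.
public class PipelinePolicySketch {

  // Assumed ratio from the discussion: roughly one flush-to-disk per
  // 4 in-memory flushes, so ~4 segments accumulate in the pipeline.
  static final int SEGMENTS_BETWEEN_DISK_FLUSHES = 4;

  // Merge is triggered only once the pipeline grows past the threshold.
  static boolean shouldMerge(int segmentsInPipeline, int threshold) {
    return segmentsInPipeline >= threshold;
  }

  public static void main(String[] args) {
    int threshold = 6; // i.e. THRESHOLD_PIPELINE_SEGMENTS > 5
    for (int segments = 1; segments <= SEGMENTS_BETWEEN_DISK_FLUSHES; segments++) {
      System.out.printf("segments=%d merge=%b%n",
          segments, shouldMerge(segments, threshold));
    }
    // Prints merge=false for every common case: with ~4 segments between
    // disk flushes and a threshold above 5, merges happen only under stress.
  }
}
{code}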

bq. The one thing that could be a problem is that when we have scans then we need to scan
10 segments

This JIRA is intended to provide a *mechanism of composite snapshot*, without *optimizing
THRESHOLD_PIPELINE_SEGMENTS*. Under HBASE-16417, Eshcar is running experiments with an infinite
THRESHOLD_PIPELINE_SEGMENTS. Here we want to set THRESHOLD_PIPELINE_SEGMENTS to infinity, provided
it doesn't cause any performance degradation. Then, under HBASE-16417, we should come up with
a truly optimal policy that plays with all the parameters.

bq. What prompted you to ensure that flushing the entire pipeline is better than flushing
only the tail as you were doing earlier? I think our concern was mainly that flushing the tail
only will create a lot of small files. Do you observe any other thing when flushing only the tail?

Initially, with flattening only, we had too many open files, as you saw yourself. When
we introduced merge, you reported some GC problems due to too many small indexes floating
around. Additionally, without composite snapshot the CompactingMemStore is never cleared upon
a single flush-to-disk, unless its active segment has been empty since the previous flush-to-disk.
Note that without composite snapshot, upon a flush-to-disk request you push the active segment
to the pipeline and flush the pipeline's tail only. So the active segment is not flushed unless
it is empty. Thus, in order to flush the entire CompactingMemStore to disk you need multiple
flushes, resulting in multiple files on disk, which is not desirable. So indeed the idea of
truly emptying the store upon flush-to-disk looks good to us.
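
To illustrate the difference, here is a toy model of the two behaviors. All class and method names below are invented for the example; this is neither the real HBase classes nor the patch itself:

{code:java}
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model only: names are invented for illustration.
public class SnapshotSketch {
  // Immutable segments; head = newest, tail = oldest.
  final Deque<String> pipeline = new ArrayDeque<>();
  String active = "active-0";
  int gen = 1;

  // Without composite snapshot: push active into the pipeline, then
  // flush only the pipeline's tail. One small file per flush, and the
  // store is never fully emptied by a single flush.
  List<String> tailOnlyFlush() {
    pipeline.addFirst(active);
    active = "active-" + gen++;
    List<String> toDisk = new ArrayList<>();
    toDisk.add(pipeline.removeLast());
    return toDisk;
  }

  // With composite snapshot: the whole pipeline plus the pushed active
  // segment go to disk together, leaving the store truly empty.
  List<String> compositeFlush() {
    pipeline.addFirst(active);
    active = "active-" + gen++;
    List<String> toDisk = new ArrayList<>(pipeline);
    pipeline.clear();
    return toDisk;
  }

  public static void main(String[] args) {
    SnapshotSketch a = new SnapshotSketch();
    a.pipeline.addFirst("seg-1");
    a.pipeline.addFirst("seg-2");
    System.out.println("tail-only writes: " + a.tailOnlyFlush()
        + ", left in pipeline: " + a.pipeline); // 2 segments remain

    SnapshotSketch b = new SnapshotSketch();
    b.pipeline.addFirst("seg-1");
    b.pipeline.addFirst("seg-2");
    System.out.println("composite writes: " + b.compositeFlush()
        + ", left in pipeline: " + b.pipeline); // empty
  }
}
{code}

In the tail-only variant a store holding N pipeline segments needs N separate flushes, producing N files, before it is empty; the composite variant empties it in one.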

> Flush the entire CompactingMemStore content to disk
> ---------------------------------------------------
>
>                 Key: HBASE-17081
>                 URL: https://issues.apache.org/jira/browse/HBASE-17081
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Anastasia Braginsky
>            Assignee: Anastasia Braginsky
>         Attachments: HBASE-17081-V01.patch, HBASE-17081-V02.patch, HBASE-17081-V03.patch, Pipelinememstore_fortrunk_3.patch
>
>
> Part of CompactingMemStore's memory is held by an active segment, and another part is
> divided between immutable segments in the compacting pipeline. Upon a flush-to-disk request
> we want to flush all of it to disk, in contrast to flushing only the tail of the compacting pipeline.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
