hbase-issues mailing list archives

From "Anastasia Braginsky (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16608) Introducing the ability to merge ImmutableSegments without copy-compaction or SQM usage
Date Tue, 18 Oct 2016 21:49:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15586767#comment-15586767 ]

Anastasia Braginsky commented on HBASE-16608:
---------------------------------------------

Hi [~anoop.hbase] and [~ram_krish],

I first want to thank you for your unstoppable interest and support for this big CompactingMemStore
issue! You grasp everything very quickly and see it all in deep detail. It is a real pleasure
to work with such smart people as you.

So you are saying that you would rather not use any merge, in order to keep things simpler.
Am I getting you right? I do not see how merges (especially if done once in a while) can affect
GC. The size of a CellArrayMap is negligible relative to the data size. It is reasonable to
think that GC is busy releasing the chunks that are no longer accessible after a flush to disk.

When we hold more data in memory, whether due to index or data compaction, we flush bigger
files to disk, and bigger chunks of memory need to be freed upon each flush to disk.
This sounds like a reason for GC to work harder. If you see how merges directly affect
garbage collection, please explain. All this makes me think that your suggestion (not to merge)
will result in the same GC behavior. We already had an implementation with flattening only,
and there (under stress) we saw tens of segments in the compaction pipeline. I wonder about
the performance of gets and scans in this structure. Flushing multiple segments to disk
should also prolong the flush, which is undesirable. When we had flushes waiting for merge,
Ram saw lots of blocked writes until the system became unresponsive.
So I do not clearly see why not-to-merge-at-all is a better solution.
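
To make the read-path concern concrete, here is a minimal sketch in Python. It is not HBase code: Cell, make_segment, get and merge_indexes are hypothetical names, and duplicate keys and cell versions are ignored. It assumes each flattened segment is a sorted array of references to cells, and that a get probes segments one by one. Merging only the indexes yields a single sorted array while the cell objects themselves are not copied.

```python
import bisect
import heapq


class Cell:
    """Hypothetical stand-in for cell data that stays where it was allocated."""

    def __init__(self, key, value):
        self.key = key
        self.value = value


def make_segment(cells):
    """A flattened segment modeled as a sorted list of cell references."""
    return sorted(cells, key=lambda c: c.key)


def get(segments, key):
    """Probe each segment in turn; return (cell or None, segments probed)."""
    probes = 0
    for seg in segments:
        probes += 1
        keys = [c.key for c in seg]  # rebuilt per probe only to keep the sketch short
        i = bisect.bisect_left(keys, key)
        if i < len(seg) and seg[i].key == key:
            return seg[i], probes
    return None, probes


def merge_indexes(segments):
    """Merge the sorted indexes into one list; cells are referenced, not copied."""
    return list(heapq.merge(*segments, key=lambda c: c.key))


# Ten segments of ten cells each, as in a pipeline that was never merged.
pipeline = [
    make_segment([Cell(f"row{i:03d}", i) for i in range(start, 100, 10)])
    for start in range(10)
]

cell, probes = get(pipeline, "row099")        # found only in the last segment
merged = merge_indexes(pipeline)
cell_m, probes_m = get([merged], "row099")    # single probe, same cell object
```

In this model the unmerged lookup touches all ten segments, while the merged lookup touches one and returns the very same object, so the cost of the merge is bounded by the index size rather than the data size.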

I am not attached to anything, but I am really trying to understand why one way is better than
the other. I mean, can we measure the benefit of not merging relative to intermediate
merging?

Regarding the issue of the N-MSLABs merge, raised in the RB, it looks irrelevant for now, until we
decide whether to merge or not to merge. Thus let's discuss it later. I also believe this
is not such a big issue and we can arrange it one way or another.

Best,
Anastasia

> Introducing the ability to merge ImmutableSegments without copy-compaction or SQM usage
> ---------------------------------------------------------------------------------------
>
>                 Key: HBASE-16608
>                 URL: https://issues.apache.org/jira/browse/HBASE-16608
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Anastasia Braginsky
>            Assignee: Anastasia Braginsky
>         Attachments: HBASE-16417-V02.patch, HBASE-16417-V04.patch, HBASE-16417-V06.patch,
>                      HBASE-16417-V07.patch, HBASE-16417-V08.patch, HBASE-16417-V10.patch,
>                      HBASE-16608-V01.patch, HBASE-16608-V03.patch, HBASE-16608-V04.patch,
>                      HBASE-16608-V08.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
