hbase-issues mailing list archives

From "Anastasia Braginsky (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14921) Memory optimizations
Date Wed, 20 Jul 2016 11:12:20 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385690#comment-15385690 ]

Anastasia Braginsky commented on HBASE-14921:
---------------------------------------------

bq. Am not very sure on this. You mean most of the cases will have duplicates? There are use
cases we have seen where there is not much duplicates and each row is unique. Say in a time
based row key impl. 

No, I do not mean that most cases will have duplicates. I am sure there are cases where
there are no duplicates at all. I mean, for example, cases where there are periods of time with
more duplicates and periods with fewer, and this is not clearly known ahead of time. Usually,
use cases with no duplicates at all or with lots of duplicates are rare. I just think
that 10-15% duplicates should be worth compacting...

bq. Yes minor compaction on the disk is a bottleneck because of IO. But in the case where
you have very less duplicates you are doing that operation twice, once in memory and once
in disk. This patch is not going to say that since memory compaction has been done avoid disk
minor compaction. Coming to deletes, there are use cases where the deletes are there but very
rare. So even when the in memory compaction is going to remove such deletes (if it is encountered)
that is going to create a flush which is going to be slightly lesser in size but again the
minor compaction will be performed on this file also.

I agree with you that without duplicates in-memory compaction is unnecessary. I just wanted
to show that even with few duplicates you gain more than just space in memory.

The results are very interesting. On which version exactly was the estimation done? On my
previous patch? Let me give you a new and updated patch today.
Thank you, Ramkrishna!

> Memory optimizations
> --------------------
>
>                 Key: HBASE-14921
>                 URL: https://issues.apache.org/jira/browse/HBASE-14921
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.0.0
>            Reporter: Eshcar Hillel
>            Assignee: Anastasia Braginsky
>         Attachments: CellBlocksSegmentInMemStore.pdf, CellBlocksSegmentinthecontextofMemStore(1).pdf,
HBASE-14921-V01.patch, HBASE-14921-V02.patch, HBASE-14921-V03.patch, HBASE-14921-V04-CA-V02.patch,
HBASE-14921-V04-CA.patch, HBASE-14921-V05-CAO.patch, InitialCellArrayMapEvaluation.pdf, IntroductiontoNewFlatandCompactMemStore.pdf
>
>
> Memory optimizations including compressed format representation and offheap allocations



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
