hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-16608) Introducing the ability to merge ImmutableSegments without copy-compaction or SQM usage
Date Wed, 19 Oct 2016 10:56:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15588008#comment-15588008 ]

Anoop Sam John edited comment on HBASE-16608 at 10/19/16 10:56 AM:
-------------------------------------------------------------------

Another important point has come up.
We discussed elsewhere the duplication of either cell data or the index (in the compaction or
merge case), and Ram just mentioned it once again.
Consider only the merge case, which just copies the cell refs from 2+ arrays into one.
The duplicated refs are not temporary, and there is not just one extra copy.  Let me explain
with an example:

Two segments in the pipeline are merged, each having 100 cells, so each has a cell array
with 100 refs (each ref is 8 bytes).  The merge creates a new segment that contains a cell
array with 200 entries.  Yes, there are no copies of the cells, but 200 new cell refs are
created.
So before the 1st merge there were 200 cell refs in total, and now there are 400.
Now say one more segment comes in and we merge it with the old *merged* one.  Assume the
new segment also has 100 cells, so before this merge op there are 400 + 100 = 500 refs in
total.
The 2nd merge also won't touch the existing segments; it creates a new segment whose new
array contains 300 refs, for 800 refs in total.
There are only 300 actual cell objects, yet 2 merges brought the ref count to 800.  So this
is more than just doubling.  If many such merges happen before a flush, we will waste a lot
of heap space on refs.
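
The arithmetic above can be reproduced with a small stand-alone sketch. This is not HBase
code: the class MergeRefCount and the method totalRefsRetained are hypothetical names used
only for illustration, and the model assumes what the example describes, namely that every
merge allocates a fresh ref array while the source arrays are left untouched until a flush.

```java
// Hypothetical model, not HBase code: a segment is represented only by the
// length of its cell-ref array.
final class MergeRefCount {
    // Ref size taken from the example in the comment (8 bytes per ref).
    static final int REF_BYTES = 8;

    /**
     * Total refs held across all arrays after merging `segments` segments one
     * at a time, ASSUMING the source arrays stay allocated after each merge.
     */
    static long totalRefsRetained(int cellsPerSegment, int segments) {
        if (cellsPerSegment < 0 || segments < 1) {
            throw new IllegalArgumentException("need segments >= 1 and cells >= 0");
        }
        long total = cellsPerSegment;   // array of the first segment
        long merged = cellsPerSegment;  // size of the current merged array
        for (int i = 2; i <= segments; i++) {
            total += cellsPerSegment;   // the incoming segment's own array
            merged += cellsPerSegment;  // merged array refers to every cell so far
            total += merged;            // the merge allocates a new array of that size
        }
        return total;
    }

    public static void main(String[] args) {
        long refs = totalRefsRetained(100, 3);  // 3 segments means 2 merges
        long cells = 100L * 3;                  // actual cell objects
        System.out.println("refs=" + refs + " cells=" + cells
                + " refBytes=" + refs * REF_BYTES);
        // prints: refs=800 cells=300 refBytes=6400
    }
}
```

Under these assumptions the ref count grows roughly quadratically with the number of merges,
while the number of cell objects grows only linearly.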



was (Author: anoop.hbase):
Another important point has come up.
We discussed elsewhere the duplication of either cell data or the index (in the compaction or
merge case), and Ram just mentioned it once again.
Consider only the merge case, which just copies the cell refs from 2+ arrays into one.
The duplicated refs are not temporary, and there is not just one extra copy.  Let me explain
with an example:

Two segments in the pipeline are merged, each having 100 cells, so each has a cell array
with 100 refs (each ref is 8 bytes).  The merge creates a new segment that contains a cell
array with 200 entries.  Yes, there are no copies of the cells, but 200 new cell refs are
created.
So before the 1st merge there were 200 cell refs in total, and now there are 400.
Now say one more segment comes in and we merge it with the old *merged* one.  Assume the
new segment also has 100 cells, so before this merge op there are 400 + 100 = 500 refs in
total.
The 2nd merge also won't touch the existing segments; it creates a new segment whose new
array contains 300 refs, for 800 refs in total.
There are only 300 actual cell objects, yet 2 merges brought the ref count to 800.  So this
is more than just doubling.  If many such merges happen before a flush, we will waste a lot
of heap space on refs.

Just test with the attached patch, which does not solve the problem correctly.  It is a hack
and cannot be considered in any way.  I am sending it to you just to check whether this fix
improves the full GC pattern.


> Introducing the ability to merge ImmutableSegments without copy-compaction or SQM usage
> ---------------------------------------------------------------------------------------
>
>                 Key: HBASE-16608
>                 URL: https://issues.apache.org/jira/browse/HBASE-16608
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Anastasia Braginsky
>            Assignee: Anastasia Braginsky
>         Attachments: HBASE-16417-V02.patch, HBASE-16417-V04.patch, HBASE-16417-V06.patch,
>                      HBASE-16417-V07.patch, HBASE-16417-V08.patch, HBASE-16417-V10.patch,
>                      HBASE-16608-V01.patch, HBASE-16608-V03.patch, HBASE-16608-V04.patch,
>                      HBASE-16608-V08.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
