hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16421) Introducing the CellChunkMap as a new additional index variant in the MemStore
Date Tue, 27 Dec 2016 07:21:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15779804#comment-15779804 ]

ramkrishna.s.vasudevan commented on HBASE-16421:

I did some experiments. This was my setup:
I had a single-node cluster with YCSB as the test framework.
In order to test scan perf I had to run a mixed read and write workload, so I created two client
instances. One was doing the YCSB load with 100 threads (recordcount=10000000, operationcount=750000000).
The other instance of the YCSB client was performing pure scans with 50 threads. (This imitates
a read/write workload with 150 threads in total.)
The load phase ran for about 16 mins, and during this time the GC graphs were plotted
(we have G1GC configured).
In YCSB it is not possible to specify an end key, so scans are effectively never-ending most of
the time. Hence I went with this approach.
From the 'jmc profiler' I could see that flushes were creating a lot of garbage with the chunk
map, as we were creating a Cell every time, which adds up to quite a lot of small objects.
Similarly, the scan trace also created Cell objects from the chunks. So overall, with CellChunkMap
in read/write mode we are generating more garbage (in terms of number), but the overall time
is not significantly larger.
The numbers are as follows:

||Type of memstore||Number of pauses||Total time||
|Array map based memstore|566|46.36s|
|Chunk map based memstore |899|51.56s|
|Default memstore (no MSLAB and chunkpool) |446|37.87s|
|Offheap memstore |469|29.8s|

Let me know if we need to do some more tests or adopt some other methodology here.
But I think with a mixed read/write load, yes, we do generate more garbage (than with pure writes).
So overall the chunk map reduces the GC overhead (I can see that the mixed and young GC avg is
the lowest among the above), but since we have more small objects we have a higher pause count. So
can we still pursue this CellChunkMap? Thoughts!!!
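
As a rough cross-check of the table above, the following minimal sketch divides each total time by its pause count to get a mean time per pause. It assumes the "Total time" column is the cumulative GC pause time in seconds; the dictionary and the helper name `mean_pause_ms` are illustrative only and are not part of HBase or YCSB. The result is a derived figure and may not match the "mixed and young GC avg" read from the GC graphs.

```python
# GC figures copied from the table above:
# memstore type -> (number of pauses, total time in seconds)
results = {
    "Array map based memstore": (566, 46.36),
    "Chunk map based memstore": (899, 51.56),
    "Default memstore (no MSLAB and chunkpool)": (446, 37.87),
    "Offheap memstore": (469, 29.8),
}


def mean_pause_ms(pauses, total_seconds):
    """Mean time per pause in milliseconds (hypothetical helper)."""
    return total_seconds / pauses * 1000.0


# Compute the mean pause per memstore type and list them, shortest first.
averages = {name: mean_pause_ms(*row) for name, row in results.items()}
for name, avg in sorted(averages.items(), key=lambda kv: kv[1]):
    print(f"{name}: {avg:.1f} ms per pause")

# The memstore type with the shortest mean pause.
lowest = min(averages, key=averages.get)
```

Under that assumption, the chunk map based memstore has the shortest mean pause (roughly 57 ms, against roughly 82 ms for the array map), even though it has the highest pause count.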

> Introducing the CellChunkMap as a new additional index variant in the MemStore
> ------------------------------------------------------------------------------
>                 Key: HBASE-16421
>                 URL: https://issues.apache.org/jira/browse/HBASE-16421
>             Project: HBase
>          Issue Type: Umbrella
>            Reporter: Anastasia Braginsky
>         Attachments: CellChunkMapRevived.pdf, IntroductiontoNewFlatandCompactMemStore.pdf
> Follow up for HBASE-14921. This is going to be the umbrella JIRA to include all the parts
> of integration of the CellChunkMap to the MemStore.

This message was sent by Atlassian JIRA