Date: Tue, 27 Dec 2016 07:21:58 +0000 (UTC)
From: "ramkrishna.s.vasudevan (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-16421) Introducing the CellChunkMap as a new additional index variant in the MemStore

[ https://issues.apache.org/jira/browse/HBASE-16421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15779804#comment-15779804 ]

ramkrishna.s.vasudevan commented on HBASE-16421:
------------------------------------------------

I did some experiments.
This was my setup: a single-node cluster with YCSB as the test framework. To test scan performance I had to run a mixed read and write workload, so I created two client instances. One ran the YCSB load with 100 threads (recordcount=10000000, operationcount=750000000). The other ran pure scans with 50 threads. (Together this imitates a read/write workload with 150 threads in total.) The load phase ran for about 16 minutes, and during this time the GC graphs were plotted (we have G1GC configured). In YCSB it is not possible to specify an end key, so most scans effectively never end. Hence I went with this approach.

From the 'jmc profiler' I could see that flushes were creating a lot of garbage with the chunk map, because we create a Cell every time, which adds up to quite a lot of small objects. Similarly, the scan trace also created Cell objects from the chunks. So overall, with CellChunkMap in read/write mode we generate more garbage (in terms of number), but the overall time is not significantly larger. The numbers are as follows:

||Type of memstore||Number of pauses||Total time||
|Array map based memstore|566|46.36s|
|Chunk map based memstore|899|51.56s|
|Default memstore (no MSLAB and chunkpool)|446|37.87s|
|Offheap memstore|469|29.8s|

Let me know if we need to do more tests or adopt a different methodology here. But I think that with a mixed read/write load, yes, we do generate more garbage (than with pure writes). So overall the chunk map reduces the GC overhead (I can see that the mixed and young GC average is the lowest among the above), but since we have more small objects we have a higher count. So can we still pursue this CellChunkMap? Thoughts?
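As a rough cross-check of the table, the per-pause average can be derived from the two reported columns. The sketch below is not part of the original measurements. It assumes that "Total time" is the sum of all GC pause durations, and the names `gc_results` and `average_pause_ms` are made up for illustration.

```python
# Figures copied from the table: (number of pauses, total time in seconds).
gc_results = {
    "Array map based memstore": (566, 46.36),
    "Chunk map based memstore": (899, 51.56),
    "Default memstore (no MSLAB and chunkpool)": (446, 37.87),
    "Offheap memstore": (469, 29.8),
}


def average_pause_ms(pauses, total_seconds):
    """Mean pause length in ms, assuming 'Total time' is the sum of all pauses."""
    return total_seconds / pauses * 1000.0


# Derived metric: average pause per memstore type.
averages = {name: average_pause_ms(*row) for name, row in gc_results.items()}

# Print from shortest to longest average pause.
for name, avg in sorted(averages.items(), key=lambda kv: kv[1]):
    print(f"{name}: {avg:.1f} ms per pause")
```

Under that assumption the chunk map has the most pauses but the shortest average pause (roughly 57 ms, against roughly 82 ms for the array map). This is a single average over all pauses, not the separate mixed and young GC averages mentioned in the comment.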
> Introducing the CellChunkMap as a new additional index variant in the MemStore
> ------------------------------------------------------------------------------
>
> Key: HBASE-16421
> URL: https://issues.apache.org/jira/browse/HBASE-16421
> Project: HBase
> Issue Type: Umbrella
> Reporter: Anastasia Braginsky
> Attachments: CellChunkMapRevived.pdf, IntroductiontoNewFlatandCompactMemStore.pdf
>
>
> Follow up for HBASE-14921. This is going to be the umbrella JIRA to include all the parts of integration of the CellChunkMap to the MemStore.