hbase-issues mailing list archives

From "Eshcar Hillel (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16417) In-Memory MemStore Policy for Flattening and Compactions
Date Fri, 10 Mar 2017 12:22:04 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15904985#comment-15904985 ]

Eshcar Hillel commented on HBASE-16417:
---------------------------------------

We already ran some experiments with merge, with really good results for the write-only workload
and no extra overhead in mixed workloads.
We thought the right way to go was to first refresh the code, commit it to master, and then
re-run the experiments and publish the results.
You can review the code in HBASE-17765.

In the past we ran experiments with value=1KB (see penultimate report), but since then the
code has changed a lot. Indeed the effect of reducing the metadata decreases as the size of
the data itself increases. It's a good idea to run (at least some of the experiments) with
1KB values.
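
If the runs are driven by YCSB (an assumption on my part; the reports attached to this issue
would confirm it), a 1KB-value workload could be configured with the standard workload
properties below. The single-field layout is also an assumption, chosen so the value size is
exactly 1KB:

```properties
# Hypothetical YCSB workload fragment for ~1KB values.
# fieldcount/fieldlength/requestdistribution are standard YCSB properties.
fieldcount=1
fieldlength=1024
# Match the key distributions discussed below:
requestdistribution=zipfian
```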

We were unable to get greater throughput with sync wal mode (even with more than 12 threads)
so we decided to test with async wal which helps simulate greater load by its nature.
Batching at the client side is for the same reason -- it significantly increases the load
on the servers and reduces the running time by order of magnitude.
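
For reference, a sketch of that setup (the table name 'usertable' is hypothetical; DURABILITY
is a standard table attribute in the HBase shell):

```
# In the HBase shell: switch the (hypothetical) test table to async wal.
alter 'usertable', DURABILITY => 'ASYNC_WAL'
```

Client-side batching can then use the standard BufferedMutator API
(Connection.getBufferedMutator), which buffers Puts and ships them to the
region servers in batches instead of one RPC per mutation.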

Note that in sync wal mode all policies have the same number of wal files and the same volume
of wal data. The number of wal files is smaller with async wal for all policies (in both the
zipfian and the uniform key distribution). When you find the answer to why this happens, it
might explain the number of wal files in the eager policy.

Number of pipeline segments: while 4*4=16 would be the maximal number, 4/2=2 would be the
number in expectation.
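
One possible reading of the arithmetic above (assuming 4 is both the number of pipelines and
the per-pipeline segment threshold before an in-memory merge; both are my assumptions, not
stated in this thread):

\[
n_{\max} = 4 \cdot 4 = 16, \qquad \mathbb{E}[n] \approx \frac{4}{2} = 2
\]

where the expectation treats a single pipeline's length as uniform between 0 and 4 between
merges, so it holds 4/2 = 2 segments on average.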

GC generally takes less than 1% of the running time. Since all experiments run with the same
GC parameters, I don't think it's important which parameters we use. We are not trying to
optimize performance here, just to have a fair comparison under high load and a high volume
of data.
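
Pinning the GC settings across runs could look like the following (hbase-env.sh;
HBASE_REGIONSERVER_OPTS is the standard hook, but the specific flags are illustrative and
not the ones actually used in the attached reports):

```
# hbase-env.sh -- illustrative fixed GC settings for a fair comparison;
# the actual parameters used in the reports may differ.
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS \
  -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 \
  -verbose:gc -Xloggc:/tmp/hbase-regionserver-gc.log"
```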

> In-Memory MemStore Policy for Flattening and Compactions
> --------------------------------------------------------
>
>                 Key: HBASE-16417
>                 URL: https://issues.apache.org/jira/browse/HBASE-16417
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Anastasia Braginsky
>            Assignee: Eshcar Hillel
>             Fix For: 2.0.0
>
>         Attachments: HBASE-16417-benchmarkresults-20161101.pdf, HBASE-16417-benchmarkresults-20161110.pdf,
HBASE-16417-benchmarkresults-20161123.pdf, HBASE-16417-benchmarkresults-20161205.pdf, HBASE-16417-benchmarkresults-20170309.pdf
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
