hbase-issues mailing list archives

From "zhangduo (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk
Date Wed, 15 Oct 2014 03:58:34 GMT

     [ https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangduo updated HBASE-10201:
    Attachment: HBASE-10201-0.98_2.patch

Benchmark run with TestPerColumnFamilyFlush.

Setup: 3 CFs (16B values for CF1, 256B values for CF2, 4K values for CF3), 1M rows,
128M memstore flush size, 16M CF flush size.

Result without per CF flush:
NumStoreFiles: 7, StoreFileSize: 4336644762, NumCompactionsCompleted: 46, NumFilesCompacted: 146, NumBytesCompacted: 11132103132
Write amplification: 2.57

Result with per CF flush:
NumStoreFiles: 10, StoreFileSize: 4482510274, NumCompactionsCompleted: 27, NumFilesCompacted: 89, NumBytesCompacted: 10353603767
Write amplification: 2.31
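
The reported write amplification figures are consistent with dividing NumBytesCompacted by StoreFileSize. That definition is inferred from the numbers, not stated in the message, so treat it as an assumption. A minimal sketch that reproduces both values from the counters above (the function and variable names are illustrative only):

```python
# Counters copied from the two benchmark runs reported above.
runs = {
    "without per-CF flush": {
        "store_file_size": 4336644762,
        "bytes_compacted": 11132103132,
    },
    "with per-CF flush": {
        "store_file_size": 4482510274,
        "bytes_compacted": 10353603767,
    },
}


def write_amplification(bytes_compacted, store_file_size):
    # Assumed definition: bytes rewritten by compactions per byte of
    # store file data left on disk at the end of the run.
    return bytes_compacted / store_file_size


for name, counters in runs.items():
    wa = write_amplification(
        counters["bytes_compacted"], counters["store_file_size"]
    )
    print(f"{name}: {wa:.2f}")
# prints:
# without per-CF flush: 2.57
# with per-CF flush: 2.31
```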

Next I will run this benchmark on a real cluster instead of a minicluster.

> Port 'Make flush decisions per column family' to trunk
> ------------------------------------------------------
>                 Key: HBASE-10201
>                 URL: https://issues.apache.org/jira/browse/HBASE-10201
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Ted Yu
>         Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, HBASE-10201-0.98_1.patch,
> Currently the flush decision is made using the aggregate size of all column families.
> When large and small column families co-exist, this causes many small flushes of the smaller
> CF. We need to make per-CF flush decisions.
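
To illustrate the idea in the description, here is a minimal sketch of a per-CF flush decision. It is not the code from the attached patches. The function name, the dict-based interface, and the fallback of flushing every CF when none is individually large are assumptions made for illustration. The two thresholds are taken from the benchmark setup in this message (128M memstore flush size, 16M CF flush size).

```python
MB = 1024 * 1024

# Thresholds taken from the benchmark setup in this message.
MEMSTORE_FLUSH_SIZE = 128 * MB  # aggregate size that triggers a flush
CF_FLUSH_SIZE = 16 * MB         # per-CF size above which a CF is flushed


def select_cfs_to_flush(cf_sizes,
                        memstore_flush_size=MEMSTORE_FLUSH_SIZE,
                        cf_flush_size=CF_FLUSH_SIZE):
    """Return the names of the CFs to flush (hypothetical helper).

    cf_sizes maps a column family name to its current memstore size in bytes.
    """
    # No flush is needed until the aggregate size reaches the threshold.
    if sum(cf_sizes.values()) < memstore_flush_size:
        return []
    # Flush only the CFs that are large on their own, so that small CFs
    # are not written out as many tiny store files.
    large = [name for name, size in cf_sizes.items() if size >= cf_flush_size]
    # Assumed fallback: if no single CF is large enough, flush all of them
    # so that memory is still released.
    return large if large else list(cf_sizes)


sizes = {"CF1": 1 * MB, "CF2": 10 * MB, "CF3": 120 * MB}
print(select_cfs_to_flush(sizes))  # prints ['CF3']
```

With an aggregate-only decision, all three CFs in this example would be flushed together, including the 1 MB memstore of CF1.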

This message was sent by Atlassian JIRA
