hbase-issues mailing list archives

From "Yu Li (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-14457) Umbrella: Improve Multiple WAL for production usage
Date Tue, 12 Jan 2016 10:46:39 GMT

     [ https://issues.apache.org/jira/browse/HBASE-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Li updated HBASE-14457:
    Attachment: Action in Multiple WAL.pdf

Here comes the doc. Sorry for the lag, but I hope it's worth the wait. :-)

I'd like to highlight the testing results:
* PerformanceEvaluation testing with pure SATA disks shows a ~20% performance improvement on writes, with 4 WALs per regionserver
* Monitoring data from our online production cluster (800+ nodes) shows a ~40% improvement in mutate latency with mixed workloads
* hsync writes with 4 WALs and PCIe SSD show promising throughput (20k per server) and latency (5.5ms on average)

Refer to the doc for more details; it also covers the design and usage of multiple WAL.

Feel free to let me know if you have any comments/questions. Thanks.

> Umbrella: Improve Multiple WAL for production usage
> ---------------------------------------------------
>                 Key: HBASE-14457
>                 URL: https://issues.apache.org/jira/browse/HBASE-14457
>             Project: HBase
>          Issue Type: Umbrella
>            Reporter: Yu Li
>            Assignee: Yu Li
>             Fix For: 2.0.0, 1.3.0
>         Attachments: Action in Multiple WAL.pdf
> HBASE-5699 proposed the idea of running multiple WALs in a regionserver and did great
> initial work there, but when trying to use it in our production cluster we still found several
> issues to resolve, such as tracking multiple WAL paths in replication (HBASE-6617), fixing UTs
> with the multiwal provider (HBASE-14411), introducing a namespace-based strategy for RegionGroupingProvider
> (HBASE-14456), etc. This is an umbrella including (but not limited to) all this work and
> effort to make multiple WAL ready for production usage and give users a clear picture about
> Besides the development work done, I'd also like to share some scenarios and testing/online
> data in this JIRA about our usage/performance of multiple WAL, to (hopefully) help people better
> judge whether to enable multiple WAL in their own cluster and what they might gain.

This message was sent by Atlassian JIRA
