accumulo-notifications mailing list archives

From "Luke Brassard (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (ACCUMULO-1083) add concurrency to HDFS write-ahead log
Date Fri, 08 Mar 2013 16:54:13 GMT

     [ https://issues.apache.org/jira/browse/ACCUMULO-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Brassard updated ACCUMULO-1083:
------------------------------------

    Attachment: walog-replication-factor-performance.jpg

Here are the updated results, which show the performance gains expected from tweaking the
{{tserver.wal.replication}} setting.

||version||walog replication||MB/sec/node||entries ingested||
|accumulo-1.4.2|default|6.691|1024000000|
|accumulo-1.4.2|false|23.583|4096000000|
|accumulo-1.5.0-SNAPSHOT|default|5.734|1024000000|
|accumulo-1.5.0-SNAPSHOT|2|8.366|1024000000|
|accumulo-1.5.0-SNAPSHOT|1|14.883|1024000000|
|accumulo-1.5.0-SNAPSHOT|false|27.321|4096000000|
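
To make the relative gains easier to read, here is a rough sketch that divides each MB/sec/node figure by the "default" run of the same version. The numbers are copied from the table above; the {{results}} dict and the {{speedups}} helper are names made up for this illustration, and it assumes that "false" in the walog replication column means the write-ahead log was turned off.

```python
# Throughput figures (MB/sec/node) copied from the table above.
# Keys are the "walog replication" column values; "false" is assumed
# to mean the write-ahead log was disabled for that run.
results = {
    "accumulo-1.4.2": {"default": 6.691, "false": 23.583},
    "accumulo-1.5.0-SNAPSHOT": {
        "default": 5.734,
        "2": 8.366,
        "1": 14.883,
        "false": 27.321,
    },
}


def speedups(results):
    """Return each run's throughput relative to the same version's default run."""
    out = {}
    for version, runs in results.items():
        baseline = runs["default"]
        for setting, rate in runs.items():
            out[(version, setting)] = rate / baseline
    return out


if __name__ == "__main__":
    for (version, setting), ratio in speedups(results).items():
        print(f"{version:26s} {setting:8s} {ratio:.2f}x")
```

The ratios compare rates rather than totals, so the different entry counts between runs do not enter into the calculation.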

I ran the tests and cleared the results.

I've attached another screenshot (walog-replication-factor-performance.jpg) that shows the tests with walog replication set to 2 and 1, and with the walog turned off.
                
> add concurrency to HDFS write-ahead log
> ---------------------------------------
>
>                 Key: ACCUMULO-1083
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1083
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: tserver
>            Reporter: Adam Fuchs
>             Fix For: 1.6.0
>
>         Attachments: walog-performance.jpg, walog-replication-factor-performance.jpg
>
>
> When running tablet servers on beefy nodes (lots of disks), the write-ahead log can be
> a serious bottleneck. Today we ran a continuous ingest test of 1.5-SNAPSHOT on an 8-node (plus
> a master node) cluster in which the nodes had 32 cores and 15 drives each. Running with write-ahead
> log off resulted in a >4x performance improvement sustained over a long period.
> I believe the culprit is that the WAL is only using one file at a time per tablet server,
> which means HDFS is only appending to one drive (plus replicas). If we increase the number
> of concurrent WAL files supported on a tablet server we could probably drastically improve
> the performance on systems with many disks. As it stands, I believe Accumulo is significantly
> more optimized for a larger number of smaller nodes (3-4 drives).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
