accumulo-dev mailing list archives

From "Eric Newton (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-578) consider using hdfs for the walog
Date Tue, 22 May 2012 13:31:40 GMT


Eric Newton commented on ACCUMULO-578:

I did a quick measurement of the number of NN operations performed over a short ingest period,
to evaluate whether logging to HDFS would significantly increase pressure on the NN.

I pre-split the ingest table into 8 tablets and used goraci to ingest 80 million entries (240M
k-v).  I grabbed the jmx counters on the NN before the start of the map-reduce job, and then
10 minutes later.  That was enough time for the idle flush and the GC to run a couple of times.

As expected, the change was negligible.

I don't have a good sense of the relative costs of the different op types, but nothing
stands out as dramatically different.
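The before-and-after counter grab described above can be reproduced against the NameNode's `/jmx` servlet (the JSON view of its JMX MBeans). The sketch below is a minimal example, not the script actually used; the host, port (50070, the classic NN web UI port), and the `NameNodeActivity` bean name are assumptions that vary by Hadoop version:

```python
import json
from urllib.request import urlopen

# NameNode JMX endpoint -- host, port, and bean name are assumptions;
# adjust for your cluster and Hadoop version.
NN_JMX_URL = ("http://namenode:50070/jmx"
              "?qry=Hadoop:service=NameNode,name=NameNodeActivity")

def fetch_nn_ops(url=NN_JMX_URL):
    """Snapshot the NameNode's per-operation counters as a dict."""
    bean = json.load(urlopen(url))["beans"][0]
    # Keep only the numeric counters (CreateFileOps, GetListingOps, ...).
    return {k: v for k, v in bean.items() if isinstance(v, int)}

def diff_counters(before, after):
    """Delta of each counter between two snapshots, dropping unchanged ops."""
    return {k: after[k] - before[k]
            for k in after
            if k in before and after[k] != before[k]}

# Usage on a live cluster:
#   before = fetch_nn_ops()
#   ... run the ingest, wait for idle flush and GC ...
#   after = fetch_nn_ops()
#   print(diff_counters(before, after))
```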

> consider using hdfs for the walog
> ---------------------------------
>                 Key: ACCUMULO-578
>                 URL:
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: logger, tserver
>    Affects Versions: 1.5.0-SNAPSHOT
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>         Attachments: HDFS_WAL_states.pdf, NNOpsComparison.pdf, comparison.png
> Using HDFS for walogs would fix:
>  * ACCUMULO-84: any node can read the replicated files
>  * ACCUMULO-558: wouldn't need to monitor loggers
>  * ACCUMULO-544: log references wouldn't include hostnames
>  * ACCUMULO-423: wouldn't need to monitor loggers
>  * ACCUMULO-258: hdfs has load balancing already
> To implement it, we would need the ability to distribute log sorts.
> Continuing to use loggers helps us avoid:
>  * hdfs pipeline strategy
>  * lack of fine-grained insight when a single node makes dfs slow
>  * additional namenode pressure
>  * flexibility: for example, we can add fadvise() calls to the logger before HDFS supports them

This message is automatically generated by JIRA.
