accumulo-notifications mailing list archives

From "Jonathan Park (JIRA)" <>
Subject [jira] [Updated] (ACCUMULO-2668) slow WAL writes
Date Mon, 14 Apr 2014 19:01:17 GMT


Jonathan Park updated ACCUMULO-2668:

    Status: Patch Available  (was: Open)

Attaching a possible fix.

> slow WAL writes
> ---------------
>                 Key: ACCUMULO-2668
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>    Affects Versions: 1.6.0
>            Reporter: Jonathan Park
>         Attachments: noflush.diff
> During continuous ingest, we saw over 70% of our ingest time taken up by writes to the
> WAL. When we ran the DfsLogger in isolation (created one outside of the Tserver), we saw
> about 25MB/s throughput (computed as the estimated size of the mutations sent to the
> DfsLogger class divided by the time it took to flush and sync the data to HDFS).
> After investigating, we found one possible culprit was the NoFlushOutputStream. It is
> a subclass of FilterOutputStream but does not override the write(byte[], int, int)
> method. The javadoc indicates that subclasses of FilterOutputStream should provide
> a more efficient implementation of that method.
> I've attached a small diff that illustrates and addresses the issue, but this may not
> be how we ultimately want to fix it.
> As a side note, I may be misreading the implementation of DfsLogger, but it looks like
> we always make use of the NoFlushOutputStream, even if encryption isn't enabled. There
> appears to be a faulty check in the implementation that I don't believe can be satisfied
> (line 384).
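
The FilterOutputStream behavior described in the issue can be reproduced with a minimal sketch. This is not the attached noflush.diff and not Accumulo's actual NoFlushOutputStream. The class names NoFlushSketch, CountingSink, SlowNoFlush and FastNoFlush are hypothetical, and the no-op flush() is an assumption based only on the class name. The sketch relies on the documented default of FilterOutputStream.write(byte[], int, int), which calls write(int) once per byte.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class NoFlushSketch {

    /** Hypothetical sink that counts how many calls of each kind reach it. */
    static class CountingSink extends ByteArrayOutputStream {
        int singleByteWrites = 0;
        int bulkWrites = 0;

        @Override
        public synchronized void write(int b) {
            singleByteWrites++;
            super.write(b);
        }

        @Override
        public synchronized void write(byte[] b, int off, int len) {
            bulkWrites++;
            super.write(b, off, len);
        }
    }

    /** Inherits the default bulk write, which loops over write(int). */
    static class SlowNoFlush extends FilterOutputStream {
        SlowNoFlush(OutputStream out) {
            super(out);
        }

        @Override
        public void flush() {
            // Assumption: flush is suppressed, as the name NoFlushOutputStream suggests.
        }
    }

    /** Overrides the bulk write so the whole range is passed down in one call. */
    static class FastNoFlush extends FilterOutputStream {
        FastNoFlush(OutputStream out) {
            super(out);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            out.write(b, off, len);
        }

        @Override
        public void flush() {
            // Assumption: flush is suppressed, as above.
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[1024];

        CountingSink slowSink = new CountingSink();
        new SlowNoFlush(slowSink).write(payload, 0, payload.length);
        System.out.println("slow: single=" + slowSink.singleByteWrites
                + " bulk=" + slowSink.bulkWrites);

        CountingSink fastSink = new CountingSink();
        new FastNoFlush(fastSink).write(payload, 0, payload.length);
        System.out.println("fast: single=" + fastSink.singleByteWrites
                + " bulk=" + fastSink.bulkWrites);
    }
}
```

With a 1024-byte payload, the first wrapper reaches the sink through 1024 single-byte calls and the second through one bulk call. How much of the reported WAL slowdown this explains depends on the underlying stream, which the sketch does not model.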

This message was sent by Atlassian JIRA
