accumulo-notifications mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-4154) Improve batch writer
Date Sat, 27 Feb 2016 00:08:18 GMT


ASF GitHub Bot commented on ACCUMULO-4154:

Github user keith-turner closed the pull request at:

> Improve batch writer
> --------------------
>                 Key: ACCUMULO-4154
>                 URL:
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Keith Turner
> The batch writer currently has two drawbacks:
>  * It waits for its memory to be half full and then bins mutations for the send threads.
>    I don't think this is optimal. I think it would be better to keep the send threads busy:
>    as soon as there are mutations, start working on them. If the send threads cannot keep up,
>    then work will naturally build up (without waiting for memory to be half full).
>  * The flush method blocks threads trying to add anything to the batch writer.
> I am thinking of implementing the following model for the batch writer, which is similar to
> how the conditional writer works:
>   * Have a queue that all incoming mutations are added to.
>   * Have a queue per tablet server.
>   * Have a single thread that constantly takes batches of mutations off the incoming
>     queue, bins them, and places them on the tablet server queues.
>   * When a send thread becomes idle, have it select and reserve the tablet server queue
>     with the most work on it.
>   * When mutations fail, send threads can add them back to the incoming queue.
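
The queue model listed above could be sketched roughly as follows. This is only an illustration under assumptions, not the code from the pull request: the class QueueModelSketch, its method names, and the locator function (a stand-in for looking up which tablet server holds a mutation's tablet) are all hypothetical. The binning thread and the send threads are left out; the sketch shows only the shared state they would operate on.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

// Hypothetical sketch; none of these names come from Accumulo.
class QueueModelSketch<M> {
  // Single queue that all incoming (and retried) mutations are added to.
  private final BlockingQueue<M> incoming = new LinkedBlockingQueue<>();
  // One work queue per tablet server, keyed by an assumed server id.
  private final Map<String, Deque<M>> serverQueues = new HashMap<>();
  // Servers whose queue is currently reserved by a send thread.
  private final Set<String> reserved = new HashSet<>();
  // Assumed stand-in for the tablet location lookup.
  private final Function<M, String> locator;

  QueueModelSketch(Function<M, String> locator) {
    this.locator = locator;
  }

  // Called by client threads; send threads can also call it to requeue failures.
  void add(M mutation) {
    incoming.add(mutation);
  }

  // One pass of the binning thread: take up to maxBatch mutations off the
  // incoming queue and place each on its server's queue. Returns the count binned.
  int binOnce(int maxBatch) {
    List<M> batch = new ArrayList<>();
    incoming.drainTo(batch, maxBatch);
    synchronized (this) {
      for (M m : batch) {
        serverQueues.computeIfAbsent(locator.apply(m), k -> new ArrayDeque<>()).add(m);
      }
    }
    return batch.size();
  }

  // Idle send thread: reserve the unreserved server queue with the most work.
  // Returns null when no unreserved queue has work.
  synchronized String reserveBusiest() {
    String best = null;
    int bestSize = 0;
    for (Map.Entry<String, Deque<M>> e : serverQueues.entrySet()) {
      if (!reserved.contains(e.getKey()) && e.getValue().size() > bestSize) {
        best = e.getKey();
        bestSize = e.getValue().size();
      }
    }
    if (best != null) {
      reserved.add(best);
    }
    return best;
  }

  // Remove and return all pending work for a server the caller has reserved.
  synchronized List<M> takeAll(String server) {
    Deque<M> q = serverQueues.get(server);
    List<M> work = new ArrayList<>(q);
    q.clear();
    return work;
  }

  // Give the server queue back so another send thread may reserve it.
  synchronized void release(String server) {
    reserved.remove(server);
  }
}
```

In this sketch a send thread would loop over reserveBusiest(), takeAll(), send, and release(), calling add() for any mutation that failed so it is binned again.
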
> To get better flushing behavior, each mutation can be assigned a one-up counter as it is
> added to the batch writer. We can keep track of the minimum in-progress mutation. Flush
> can inspect this counter and wait for the minimum active mutation to reach a certain count.
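
The counter-based flush described above might look something like the sketch below. The class FlushTrackerSketch and its methods are hypothetical names, and it assumes a mutation keeps its sequence number across retries until it is finally written. Flush records the next sequence number at the time it is called and waits until the minimum in-progress number reaches it, so mutations added afterwards do not delay it.

```java
import java.util.TreeSet;

// Hypothetical sketch; not the Accumulo implementation.
class FlushTrackerSketch {
  private long nextSeq = 0;                                  // the one-up counter
  private final TreeSet<Long> inProgress = new TreeSet<>();  // sequence numbers not yet written

  // Assign the next sequence number to a newly added mutation.
  synchronized long begin() {
    long seq = nextSeq++;
    inProgress.add(seq);
    return seq;
  }

  // Mark a mutation as successfully written and wake any waiting flush.
  synchronized void complete(long seq) {
    inProgress.remove(seq);
    notifyAll();
  }

  // Smallest sequence number still in progress, or nextSeq if none are.
  synchronized long minInProgress() {
    return inProgress.isEmpty() ? nextSeq : inProgress.first();
  }

  // Wait until every mutation added before this call has completed.
  // wait() releases the monitor, so other threads can keep calling begin().
  synchronized void flush() throws InterruptedException {
    long target = nextSeq;
    while (minInProgress() < target) {
      wait();
    }
  }
}
```

Because flush only holds the monitor while checking the counter, threads adding mutations are blocked just for the brief begin() call rather than for the whole flush.
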

This message was sent by Atlassian JIRA
