accumulo-notifications mailing list archives

From "Keith Turner (JIRA)" <j...@apache.org>
Subject [jira] [Created] (ACCUMULO-4154) Improve batch writer
Date Fri, 26 Feb 2016 19:53:18 GMT
Keith Turner created ACCUMULO-4154:
--------------------------------------

             Summary: Improve batch writer
                 Key: ACCUMULO-4154
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4154
             Project: Accumulo
          Issue Type: Improvement
            Reporter: Keith Turner


The batch writer currently has two drawbacks:

 * It waits for its memory to be half full and then bins mutations for the send threads.
   I don't think this is optimal. It would be better to keep the send threads busy: as
   soon as there are mutations, start working on them. If the send threads cannot keep
   up, then work will naturally build up (without waiting for memory to be half full).
 * The flush method blocks threads trying to add anything to the batch writer.
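
To illustrate the first point, here is a minimal sketch of batching that builds up
naturally instead of waiting for a memory threshold. It is not Accumulo code: the class
NaturalBatchingSketch, the method nextBatch, and the maxBatch parameter are hypothetical
names chosen for this example. A consumer blocks only until one mutation is available and
then takes whatever else has accumulated, so batches are small when the consumer keeps up
and grow when it falls behind.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;

/** Hypothetical illustration only; not part of Accumulo. */
class NaturalBatchingSketch {
  /**
   * Blocks until at least one item is queued, then also takes whatever else is
   * already waiting, up to maxBatch items in total (maxBatch must be >= 1).
   */
  static <M> List<M> nextBatch(BlockingQueue<M> queue, int maxBatch) {
    List<M> batch = new ArrayList<>();
    try {
      batch.add(queue.take());            // wait for the first item only
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      return batch;                       // empty batch signals "stop"
    }
    queue.drainTo(batch, maxBatch - 1);   // non-blocking: grab the backlog, if any
    return batch;
  }
}
```

With an idle consumer each batch would hold a single mutation; under load the batch size
would grow toward maxBatch without any explicit "half full" check.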

I am thinking of implementing the following model for the batch writer, which is similar
to how the conditional writer works:

  * Have a queue that all incoming mutations are added to.
  * Have a queue per tablet server.
  * Have a single thread that constantly takes batches of mutations off the incoming
    queue, bins them, and places them on the tablet server queues.
  * When a send thread becomes idle, have it select and reserve the tablet server queue
    with the most work on it.
  * When mutations fail, send threads can add them back to the incoming queue.
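
The model above could be sketched roughly as follows. This is only an illustration under
stated assumptions, not the actual or planned Accumulo implementation: QueueModelSketch,
ServerQueue, binOnce, reserveBusiest, drain, release and requeueFailed are hypothetical
names, and the locator function is a stand-in for whatever maps a mutation to its tablet
server. Threads are left out so that each step can be called directly; in a real writer
binOnce would run in a loop on the binning thread and the reserve/drain/release steps
would run on the send threads.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

/** Hypothetical sketch of the proposed queue model; not Accumulo code. */
class QueueModelSketch<M> {
  /** Work waiting for one tablet server, plus a flag set while a send thread owns it. */
  static final class ServerQueue<T> {
    final String server;
    final List<T> pending = new ArrayList<>();
    boolean reserved;
    ServerQueue(String server) { this.server = server; }
  }

  private final BlockingQueue<M> incoming = new LinkedBlockingQueue<>();
  private final Map<String, ServerQueue<M>> serverQueues = new HashMap<>();
  private final Function<M, String> locator; // assumed: mutation -> tablet server name

  QueueModelSketch(Function<M, String> locator) { this.locator = locator; }

  /** Writers only touch the incoming queue. */
  void add(M mutation) { incoming.add(mutation); }

  /** One pass of the binning thread: move up to maxBatch mutations to server queues. */
  synchronized int binOnce(int maxBatch) {
    List<M> batch = new ArrayList<>();
    incoming.drainTo(batch, maxBatch);
    for (M m : batch) {
      serverQueues.computeIfAbsent(locator.apply(m), s -> new ServerQueue<M>(s)).pending.add(m);
    }
    return batch.size();
  }

  /** An idle send thread reserves the unreserved queue holding the most work. */
  synchronized Optional<ServerQueue<M>> reserveBusiest() {
    ServerQueue<M> best = null;
    for (ServerQueue<M> q : serverQueues.values()) {
      if (!q.reserved && !q.pending.isEmpty()
          && (best == null || q.pending.size() > best.pending.size())) {
        best = q;
      }
    }
    if (best != null) best.reserved = true;
    return Optional.ofNullable(best);
  }

  /** The owning send thread removes the pending work so it can send it. */
  synchronized List<M> drain(ServerQueue<M> q) {
    List<M> work = new ArrayList<>(q.pending);
    q.pending.clear();
    return work;
  }

  /** Called once the send attempt is over, so other send threads may pick the queue. */
  synchronized void release(ServerQueue<M> q) { q.reserved = false; }

  /** Failed mutations go back to the incoming queue to be binned again. */
  void requeueFailed(List<M> failed) { incoming.addAll(failed); }
}
```

A single lock guards the server queues here purely for brevity; a real implementation
would likely want finer-grained coordination.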

To get better flushing behavior, each mutation can be assigned a one-up counter as it is
added to the batch writer. We can keep track of the minimum in-progress mutation. Flush
can inspect this counter and wait for the minimum active mutation to reach a certain count.
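
A minimal sketch of that flush idea, under assumptions of my own: FlushTrackerSketch and
its method names are hypothetical, and the "certain count" is taken to be the counter
value at the moment flush is called. A mutation that fails and is retried is assumed to
keep its original sequence number until it finally succeeds.

```java
import java.util.TreeSet;

/** Hypothetical sketch of counter-based flushing; not Accumulo code. */
class FlushTrackerSketch {
  private long nextSeq = 0;                                 // one-up counter
  private final TreeSet<Long> inProgress = new TreeSet<>(); // unacknowledged mutations

  /** Called when a mutation is added; returns its sequence number. */
  synchronized long begin() {
    long seq = nextSeq++;
    inProgress.add(seq);
    return seq;
  }

  /** Called when a mutation has been written successfully. */
  synchronized void complete(long seq) {
    inProgress.remove(seq);
    notifyAll(); // wake any flush waiting for the minimum to advance
  }

  /** Smallest in-progress sequence number, or nextSeq if nothing is in progress. */
  synchronized long minInProgress() {
    return inProgress.isEmpty() ? nextSeq : inProgress.first();
  }

  /** Everything added so far has a sequence number below this value. */
  synchronized long flushTarget() { return nextSeq; }

  synchronized boolean isFlushed(long target) { return minInProgress() >= target; }

  /** Waits until every mutation numbered below target is complete. */
  synchronized void awaitFlushed(long target) {
    boolean interrupted = false;
    while (!isFlushed(target)) {
      try {
        wait(); // releases the monitor, so begin() is not blocked meanwhile
      } catch (InterruptedException e) {
        interrupted = true; // keep waiting, restore the flag afterwards
      }
    }
    if (interrupted) Thread.currentThread().interrupt();
  }

  void flush() { awaitFlushed(flushTarget()); }
}
```

Because the waiting flush releases the monitor, threads can keep adding mutations while a
flush is pending; mutations added after the flush started get higher numbers and do not
delay it.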




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
