accumulo-notifications mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-1755) BatchWriter blocks all addMutation calls while binning mutations
Date Tue, 01 Mar 2016 22:18:18 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-1755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15174521#comment-15174521
] 

ASF GitHub Bot commented on ACCUMULO-1755:
------------------------------------------

Github user keith-turner commented on a diff in the pull request:

    https://github.com/apache/accumulo/pull/75#discussion_r54645749
  
    --- Diff: core/src/main/java/org/apache/accumulo/core/client/impl/TabletServerBatchWriter.java
---
    @@ -427,11 +437,11 @@ public void updateBinningStats(int count, long time, Map<String,TabletServerMuta
         updateBatchStats(binnedMutations);
       }
     
    -  private synchronized void updateBatchStats(Map<String,TabletServerMutations<Mutation>>
binnedMutations) {
    -    tabletServersBatchSum += binnedMutations.size();
    +  private void updateBatchStats(Map<String,TabletServerMutations<Mutation>>
binnedMutations) {
    +    tabletServersBatchSum.addAndGet(binnedMutations.size());
     
    -    minTabletServersBatch = Math.min(minTabletServersBatch, binnedMutations.size());
    -    maxTabletServersBatch = Math.max(maxTabletServersBatch, binnedMutations.size());
    +    minTabletServersBatch.set(Math.min(minTabletServersBatch.get(), binnedMutations.size()));
    +    maxTabletServersBatch.set(Math.max(maxTabletServersBatch.get(), binnedMutations.size()));
    --- End diff --
    
    This method of updating has a race condition.   Multiple threads could call get() before
calling set().  Also all of these atomic vars require round trips to main memory (not sure
how much this matters).   I can think of two possible solutions.  Both involve creating a
BatchWriterStats class to make the code more managable.
    
     1. Could add a syncrhonized updateBatchStats method to BatchWriterStats.  No longer syncing
on main lock or making lots of trips to main mem.
     2. Could have an AtomicRef<BatchWriterStats>.  To update batch writer stats read
the ref, clone it, make updates to clone, update ref using CAS to ensure ref has not changed.
 If ref changed, then start over.  This avoids lock, race conditions, and lots of trips to
main memory.  



> BatchWriter blocks all addMutation calls while binning mutations
> ----------------------------------------------------------------
>
>                 Key: ACCUMULO-1755
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1755
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: client
>            Reporter: Adam Fuchs
>            Assignee: Dave Marion
>             Fix For: 1.8.0
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Through code inspection, we found that the BatchWriter bins mutations inside of a synchronized
block that covers calls to addMutation. Binning potentially involves lookups of tablet metadata
and processes a fair amount of information. We will get better parallelism if we can either
unlock the lock while binning, dedicate another thread to do the binning, or use one of the
send threads to do the binning.
> This has not been verified empirically yet, so there is not yet any profiling info to
indicate the level of improvement that we should expect. Profiling and repeatable demonstration
of this performance bottleneck should be the first step on this ticket.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message