accumulo-notifications mailing list archives

From "Keith Turner (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-2766) Single walog operation may wait for multiple hsync calls
Date Tue, 10 Jun 2014 14:58:02 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-2766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026532#comment-14026532 ]

Keith Turner commented on ACCUMULO-2766:
----------------------------------------

bq. CI + Agitation is a pretty high bar for minimally ensuring that something works as intended

I cannot really think of another test that would give me confidence that this code works.
I will ensure the test is run before 1.6.1 or 1.5.2 is released.

bq. What assumptions did the extra locking provide us with?

I am not sure why the locking around sync was added.  There was a race condition in close()
that I fixed in the 2nd patch.  Maybe this race was observed in CI testing and the locking around
sync was a workaround for it.  I am only guessing though; I did not track all of the changes
from ACCUMULO-119 to now (I tried and gave up because of svn, renames, merges, etc.).  The purpose of
{{closeLock}} is to ensure nothing is added to the queue after the walog is closed.

bq. Initially it looks like we locked to offer work to the syncQueue, but did not lock to
poll? And now we do not lock for either?

The queue is a [LinkedBlockingQueue|http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/LinkedBlockingQueue.html]
which is thread safe.  We do not need to synchronize its use.  
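
As a small illustration of that point (not code from DfsLogger or from either patch), the sketch below has several threads add to one {{LinkedBlockingQueue}} with no external locking. The class name {{QueueDemo}} and the method {{fill}} are hypothetical names made up for this example.

```java
import java.util.concurrent.LinkedBlockingQueue;

public class QueueDemo {
    // Starts `producers` threads that each add `perThread` items with no
    // external locking, waits for them, and returns the final queue size.
    static int fill(int producers, int perThread) {
        LinkedBlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        Thread[] threads = new Thread[producers];
        for (int p = 0; p < producers; p++) {
            threads[p] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    queue.add(i); // the queue does its own internal locking
                }
            });
            threads[p].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return queue.size();
    }

    public static void main(String[] args) {
        System.out.println(fill(4, 10000)); // prints 40000
    }
}
```

No additions are lost even though nothing outside the queue is synchronized. This only shows that the queue itself is safe; it says nothing about ordering against close(), which is what {{closeLock}} is for.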

> Single walog operation may wait for multiple hsync calls
> --------------------------------------------------------
>
>                 Key: ACCUMULO-2766
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2766
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.5.0, 1.5.1, 1.6.0
>            Reporter: Keith Turner
>            Assignee: Keith Turner
>            Priority: Critical
>              Labels: performance
>             Fix For: 1.5.2, 1.6.1, 1.7.0
>
>         Attachments: ACCUMULO-2677-1.patch, ACCUMULO-2766-2.patch
>
>
> While looking into slow {{hsync}} calls, I noticed an oddity in the way Accumulo processes
syncs.  Specifically, given the way {{closeLock}} is used in {{DfsLogger}}, it seems like the following
situation could occur.
>  
>  # thread B starts executing DfsLogger.LogSyncingTask.run()
>  # thread 1 enters DfsLogger.logFileData()
>  # thread 1 writes to walog
>  # thread 1 locks _closeLock_ 
>  # thread 1 adds sync work to workQueue
>  # thread 1 unlocks _closeLock_
>  # thread B takes sync work off of workQueue
>  # thread B locks _closeLock_
>  # thread B calls sync
>  # thread 3 enters DfsLogger.logFileData()
>  # thread 3 writes to walog
>  # thread 3 blocks locking _closeLock_
>  # thread 4 enters DfsLogger.logFileData()
>  # thread 4 writes to walog
>  # thread 4 blocks locking _closeLock_
>  # thread B unlocks _closeLock_
>  # thread 4 locks _closeLock_ 
>  # thread 4 adds sync work to workQueue
>  # thread B takes sync work off of workQueue
>  # thread B blocks locking _closeLock_
>  # thread 4 unlocks _closeLock_
>  # thread B locks _closeLock_
>  # thread B calls sync
>  # thread B unlocks _closeLock_
>  # thread 3 locks _closeLock_
>  # thread 3 adds sync work to workQueue
>  # thread 3 unlocks _closeLock_
> In this situation thread 3 unnecessarily has to wait for an extra {{hsync}} call.  I am not
sure if this situation actually occurs, or how frequently it occurs.  Looking at the code,
it seems like it would be nice if sync operations could be queued without having to wait on
sync operations that are already in progress.
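
One way to picture the arrangement suggested above is the sketch below. It is an assumption-laden illustration, not the attached patch and not the real DfsLogger: the class {{SyncSketch}}, the {{Work}} type, and the methods {{requestSync}}, {{syncPending}} and {{close}} are all hypothetical, and a counter stands in for the actual {{hsync}} call. {{closeLock}} is held only while checking the closed flag and enqueuing, so a writer never waits behind a sync, and the sync side drains everything queued so far and covers it with one sync.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class SyncSketch {
    // Hypothetical stand-in for one queued sync request.
    static final class Work {
        final CountDownLatch done = new CountDownLatch(1);
    }

    private final LinkedBlockingQueue<Work> workQueue = new LinkedBlockingQueue<>();
    private final Object closeLock = new Object();
    private boolean closed = false;
    final AtomicInteger syncCalls = new AtomicInteger(); // counts simulated syncs

    // Writer side: called after the data has been written to the log.
    // closeLock guards only the closed flag and the enqueue.
    Work requestSync() {
        Work w = new Work();
        synchronized (closeLock) {
            if (closed) {
                throw new IllegalStateException("log closed");
            }
            workQueue.add(w);
        }
        return w;
    }

    // Sync side: drain all requests queued so far, then perform one sync
    // for the whole batch without holding closeLock.
    int syncPending() {
        List<Work> batch = new ArrayList<>();
        workQueue.drainTo(batch);
        if (batch.isEmpty()) {
            return 0;
        }
        syncCalls.incrementAndGet(); // placeholder for the real hsync call
        for (Work w : batch) {
            w.done.countDown(); // release every writer covered by this sync
        }
        return batch.size();
    }

    // After close() returns, nothing more can be added to the queue.
    void close() {
        synchronized (closeLock) {
            closed = true;
        }
    }

    public static void main(String[] args) {
        SyncSketch log = new SyncSketch();
        log.requestSync();
        log.requestSync();
        log.requestSync();
        System.out.println(log.syncPending());   // prints 3
        System.out.println(log.syncCalls.get()); // prints 1
    }
}
```

In the scenario above, threads 3 and 4 would both enqueue while thread B is inside its sync, and thread B's next pass would cover both with a single sync.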



--
This message was sent by Atlassian JIRA
(v6.2#6252)
