accumulo-notifications mailing list archives

From "Mike Drob (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-2766) Single walog operation may wait for multiple hsync calls
Date Tue, 10 Jun 2014 14:24:02 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-2766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026487#comment-14026487 ]

Mike Drob commented on ACCUMULO-2766:
-------------------------------------

CI + Agitation is a pretty high bar for minimally ensuring that something works as intended.
I do not know of anybody running those nightly (maybe [~elserj] is?)

At this point, I'm not worrying about performance; I trust the numbers that you posted (I would
love to reproduce them eventually, but don't have time for it yet). What assumptions did the
extra locking give us? Anything relating to the state of the queue? Are we exposed
to a potential concurrent modification?

Initially it looks like we locked to offer work to the syncQueue, but did not lock to poll?
And now we do not lock for either?

Concurrent code can be difficult to understand and to prove correct, so an extra test
might protect us against changes elsewhere down the line and serve as useful documentation.
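
As a rough illustration of the kind of test meant here, the sketch below has several producer threads offer to a queue while one consumer takes from it, with no external lock on either side. It assumes the queue is a java.util.concurrent.LinkedBlockingQueue, which this thread does not confirm; the class name UnlockedQueueCheck and everything in it are hypothetical and not taken from the Accumulo test suite.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stress check; not taken from the Accumulo test suite.
class UnlockedQueueCheck {

  // Returns the number of distinct items the consumer saw.
  static int run(int producers, int perProducer) {
    // Assumption: the real queue is a thread-safe java.util.concurrent queue.
    LinkedBlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
    Set<Integer> seen = ConcurrentHashMap.newKeySet();
    int total = producers * perProducer;
    try {
      // Consumer takes exactly `total` items, without any external lock.
      Thread consumer = new Thread(() -> {
        try {
          for (int i = 0; i < total; i++) {
            seen.add(queue.take());
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
      consumer.start();

      // Each producer offers a disjoint range of integers, also unlocked.
      Thread[] threads = new Thread[producers];
      for (int p = 0; p < producers; p++) {
        final int base = p * perProducer;
        threads[p] = new Thread(() -> {
          for (int i = 0; i < perProducer; i++) {
            queue.offer(base + i);
          }
        });
        threads[p].start();
      }
      for (Thread t : threads) {
        t.join();
      }
      consumer.join();
    } catch (InterruptedException e) {
      throw new IllegalStateException(e);
    }
    return seen.size();
  }

  public static void main(String[] args) {
    boolean ok = run(4, 1000) == 4000;
    System.out.println(ok ? "every item seen exactly once" : "items lost or duplicated");
  }
}
```

A passing run of a check like this does not prove correctness, but it would flag lost or duplicated queue entries if a later change swapped in a collection that is not thread-safe.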

> Single walog operation may wait for multiple hsync calls
> --------------------------------------------------------
>
>                 Key: ACCUMULO-2766
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2766
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.5.0, 1.5.1, 1.6.0
>            Reporter: Keith Turner
>            Assignee: Keith Turner
>            Priority: Critical
>              Labels: performance
>             Fix For: 1.5.2, 1.6.1, 1.7.0
>
>         Attachments: ACCUMULO-2677-1.patch, ACCUMULO-2766-2.patch
>
>
> While looking into slow {{hsync}} calls, I noticed an oddity in the way Accumulo processes
> syncs.  Specifically, given the way {{closeLock}} is used in {{DfsLogger}}, it seems like the
> following situation could occur.
>  
>  # thread B starts executing DfsLogger.LogSyncingTask.run()
>  # thread 1 enters DfsLogger.logFileData()
>  # thread 1 writes to walog
>  # thread 1 locks _closeLock_ 
>  # thread 1 adds sync work to workQueue
>  # thread 1 unlocks _closeLock_
>  # thread B takes sync work off of workQueue
>  # thread B locks _closeLock_
>  # thread B calls sync
>  # thread 3 enters DfsLogger.logFileData()
>  # thread 3 writes to walog
>  # thread 3 blocks locking _closeLock_
>  # thread 4 enters DfsLogger.logFileData()
>  # thread 4 writes to walog
>  # thread 4 blocks locking _closeLock_
>  # thread B unlocks _closeLock_
>  # thread 4 locks _closeLock_ 
>  # thread 4 adds sync work to workQueue
>  # thread B takes sync work off of workQueue
>  # thread B blocks locking _closeLock_
>  # thread 4 unlocks _closeLock_
>  # thread B locks _closeLock_
>  # thread B calls sync
>  # thread B unlocks _closeLock_
>  # thread 3 locks _closeLock_
>  # thread 3 adds sync work to workQueue
>  # thread 3 unlocks _closeLock_
> In this situation thread 3 unnecessarily has to wait for an extra {{hsync}} call.  I am not
> sure whether this situation actually occurs, or how frequently.  Looking at the code,
> it seems like it would be nice if sync work could be queued without synchronizing with the
> sync operations themselves.
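
One way to picture queueing sync work without contending with an in-flight sync is sketched below. It is a minimal illustration under assumptions, not the DfsLogger code or either attached patch: writers put work on a thread-safe queue without taking any lock that the sync thread holds, and the sync thread drains everything queued so far and covers the whole batch with one sync call. The names BatchedSyncSketch, SyncWork, logAndWait, syncLoop, and runDemo are hypothetical, and sync() is a sleep standing in for {{hsync}}.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names do not come from Accumulo's DfsLogger.
class BatchedSyncSketch {

  // One unit of sync work; the latch releases the writer that queued it.
  static final class SyncWork {
    final CountDownLatch done = new CountDownLatch(1);
  }

  // Marker that tells the sync loop to stop.
  private static final SyncWork CLOSE = new SyncWork();

  private final LinkedBlockingQueue<SyncWork> workQueue = new LinkedBlockingQueue<>();
  private final AtomicInteger syncCalls = new AtomicInteger();

  // Stand-in for an expensive hsync on the log file.
  private void sync() throws InterruptedException {
    Thread.sleep(5);
    syncCalls.incrementAndGet();
  }

  // Writer side: (the write to the log would happen here) then queue the
  // work without taking any lock that is held during sync().
  void logAndWait() throws InterruptedException {
    SyncWork work = new SyncWork();
    workQueue.put(work);
    work.done.await();
  }

  // Sync side: block for one item, drain the rest, sync once per batch.
  void syncLoop() throws InterruptedException {
    List<SyncWork> batch = new ArrayList<>();
    while (true) {
      batch.clear();
      batch.add(workQueue.take());
      workQueue.drainTo(batch);
      boolean sawClose = batch.remove(CLOSE);
      if (!batch.isEmpty()) {
        sync(); // one call covers every writer in the batch
        for (SyncWork w : batch) {
          w.done.countDown();
        }
      }
      if (sawClose) {
        return;
      }
    }
  }

  // Runs `writers` concurrent writers and returns how many syncs happened.
  static int runDemo(int writers) {
    BatchedSyncSketch log = new BatchedSyncSketch();
    try {
      Thread syncer = new Thread(() -> {
        try {
          log.syncLoop();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
      syncer.start();
      List<Thread> threads = new ArrayList<>();
      for (int i = 0; i < writers; i++) {
        Thread t = new Thread(() -> {
          try {
            log.logAndWait();
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        });
        threads.add(t);
        t.start();
      }
      for (Thread t : threads) {
        t.join();
      }
      log.workQueue.put(CLOSE);
      syncer.join();
      return log.syncCalls.get();
    } catch (InterruptedException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("sync calls for 8 writers: " + runDemo(8));
  }
}
```

In this arrangement a writer that arrives while a sync is running is picked up by the next batch, so it waits for at most the in-flight sync plus one more. The sketch ignores closing the log while writers are still active, which is presumably part of what {{closeLock}} protects in the real code.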



--
This message was sent by Atlassian JIRA
(v6.2#6252)
