hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "James Taylor (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17924) Consider sorting the row order when processing multi() ops before taking rowlocks
Date Sat, 15 Apr 2017 00:29:41 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969714#comment-15969714
] 

James Taylor commented on HBASE-17924:
--------------------------------------

The reason you'd want to take the locks in a known, consistent manner is to prevent deadlocks.
Consider a case with two clients each modifying the same two rows in a single batch. If client
#1 submits a batch with rows {A, B} and client #2 submits a batch with rows {B, A} you have
the potential to deadlock without ordering the locks in some consistent, predictable way wrt
each other, for example by row key. It's the classic "Dining Philosophers" problem (https://en.wikipedia.org/wiki/Dining_philosophers_problem).

This could be completely transparent to the client as it would only impact the order the locks
are taken and released in - the rows could still be processed in the order they were sent
over from the client.



> Consider sorting the row order when processing multi() ops before taking rowlocks
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-17924
>                 URL: https://issues.apache.org/jira/browse/HBASE-17924
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Andrew Purtell
>
> When processing a batch mutation, we take row locks in whatever order the mutations were
added to the multi op by the client.
>  
> {noformat}
> RSRpcServices#multi -> RSRpcServices#mutateRows -> HRegion#mutateRow -> HRegion#mutateRowsWithLocks
-> HRegion#processRowsWithLocks
> {noformat}
> Or
> {noformat}
> RSRpcServices#multi -> RSRpcServices#doNonAtomicRegionMutation ->
>       HRegion#get 
>     | HRegion#append 
>     | HRegion#increment 
>     | HRegionServer#doBatchOp -> HRegion#batchMutate -> HRegion#doMiniBatchMutation
> {noformat}
>  
> multi() is fed by client APIs that accept a RowMutations object containing actions for
multiple rows. The container for ops inside RowMutations is an ArrayList, which doesn't change
the ordering of objects added to it. The protobuf implementation of the messages for multi
ops do not reorder the list of actions. When processing multi ops we iterate over the actions
in the order rehydrated from protobuf.
> We should discuss sorting the order of ops by row key when processing multi() ops before
taking row locks. Does this make lock ordering more predictable for server side operations?
Yes, but potentially surprising for the client, right? Is there any legitimate reason we should
take locks out of row key sorted order because the client has structured the request as such?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message