hbase-issues mailing list archives

From "Allan Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17924) Consider sorting the row order when processing multi() ops before taking rowlocks
Date Mon, 17 Apr 2017 01:19:41 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970575#comment-15970575 ]

Allan Yang commented on HBASE-17924:
------------------------------------

Batch ops won't deadlock even if the rows are not sorted. When taking a row lock, the
thread does not wait for it; instead it fails fast, collects the rows that were not
successfully written, and tries them again. Please refer to step 1 in {{doMiniBatchMutation}}.
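
As a rough illustration of that fail-fast-and-retry pattern, here is a minimal sketch. It is not the HBase code: {{FailFastRowLocks}}, {{tryLock}}, {{unlock}} and {{miniBatch}} are hypothetical names, and row locks are modeled as a plain set of held row keys rather than real lock objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; the names are illustrative, not HBase's HRegion API.
class FailFastRowLocks {
    // Row keys currently locked by some writer.
    private final Set<String> held = ConcurrentHashMap.newKeySet();

    // Non-blocking acquire: returns false at once if the row is already locked.
    boolean tryLock(String row) {
        return held.add(row);
    }

    void unlock(String row) {
        held.remove(row);
    }

    // One mini-batch pass: apply every row whose lock is free and return
    // the rows that were skipped so the caller can retry them later.
    List<String> miniBatch(List<String> rows, List<String> applied) {
        List<String> acquired = new ArrayList<>();
        List<String> skipped = new ArrayList<>();
        try {
            for (String row : rows) {
                if (tryLock(row)) {
                    acquired.add(row);
                } else {
                    skipped.add(row); // never block on a busy row
                }
            }
            applied.addAll(acquired); // stand-in for the actual write
        } finally {
            for (String row : acquired) {
                unlock(row);
            }
        }
        return skipped;
    }
}
```

Because no thread ever blocks while holding other row locks, two batches that touch the same rows in opposite orders cannot wait on each other; the cost is the extra pass over the skipped rows.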

That said, I also think that ordering the rows before doing any operations is better. Although
the batch can now retry when it fails to acquire locks, sorting may slightly reduce the chance
of such failures and improve performance.
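
A minimal sketch of what sorting before locking could look like, assuming row keys are compared as unsigned bytes in lexicographic order (the order HBase uses for row keys). {{SortedBatch}}, {{ROW_ORDER}} and {{inLockOrder}} are hypothetical names, not taken from the patch or from our branch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; not the actual patch.
class SortedBatch {
    // Unsigned lexicographic comparison of row keys.
    static final Comparator<byte[]> ROW_ORDER = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length; // a shorter key sorts before its extensions
    };

    // Returns a sorted copy; the client's original list is left untouched.
    static List<byte[]> inLockOrder(List<byte[]> rows) {
        List<byte[]> sorted = new ArrayList<>(rows);
        sorted.sort(ROW_ORDER); // List.sort is stable
        return sorted;
    }
}
```

Since the sort is stable, several mutations on the same row keep the relative order the client gave them; only the order across different rows changes.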

If you don't mind, I can provide a patch; we have already done the sorting in our own branch
of the code.


> Consider sorting the row order when processing multi() ops before taking rowlocks
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-17924
>                 URL: https://issues.apache.org/jira/browse/HBASE-17924
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Andrew Purtell
>
> When processing a batch mutation, we take row locks in whatever order the mutations were
> added to the multi op by the client.
>  
> {noformat}
> RSRpcServices#multi -> RSRpcServices#mutateRows -> HRegion#mutateRow -> HRegion#mutateRowsWithLocks
>     -> HRegion#processRowsWithLocks
> {noformat}
> Or
> {noformat}
> RSRpcServices#multi -> RSRpcServices#doNonAtomicRegionMutation ->
>       HRegion#get 
>     | HRegion#append 
>     | HRegion#increment 
>     | HRegionServer#doBatchOp -> HRegion#batchMutate -> HRegion#doMiniBatchMutation
> {noformat}
>  
> multi() is fed by client APIs that accept a RowMutations object containing actions for
> multiple rows. The container for ops inside RowMutations is an ArrayList, which doesn't change
> the ordering of objects added to it. The protobuf implementation of the messages for multi
> ops does not reorder the list of actions. When processing multi ops we iterate over the actions
> in the order rehydrated from protobuf.
> We should discuss sorting the order of ops by row key when processing multi() ops before
> taking row locks. Does this make lock ordering more predictable for server side operations?
> Yes, but potentially surprising for the client, right? Is there any legitimate reason we should
> take locks out of row key sorted order because the client has structured the request as such?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
