hbase-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2353) HBASE-2283 removed bulk sync optimization for multi-row puts
Date Wed, 24 Mar 2010 01:48:27 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849009#action_12849009 ]

Todd Lipcon commented on HBASE-2353:
------------------------------------

bq. We can't do things like hold row locks for any substantial length of time; that introduces the opportunity to deadlock.

How so? Holding locks longer doesn't introduce deadlock; only the order of acquisition does. We could sort the rows first to avoid internal deadlock. Of course, the fact that we expose row locks to users does break this: if a user holds a lock on row C, and we try to lock A, B, C, we'll block on C while holding A and B. If a user locks rows in the "wrong order", the problem is even worse, because we'd deadlock against the client.

So, I don't think we can hold multiple row locks at the same time, no matter how short a period
we do it for, assuming row locks continue to be user-exposed.
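To make the sort-first idea concrete, here is a toy sketch of acquiring multiple row locks in a canonical order so two concurrent multi-row puts can never deadlock against each other. This is illustrative only; `RowLockTable`, `lockRows`, etc. are made-up names, not HBase API, and it doesn't model the user-exposed locks that break the scheme:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: one lock per row, always acquired in sorted row order.
class RowLockTable {
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    ReentrantLock lockFor(String row) {
        return locks.computeIfAbsent(row, r -> new ReentrantLock());
    }

    // Lock all rows in sorted (canonical) order so that no two threads can
    // acquire the same set of locks in conflicting orders -> no lock cycle.
    List<ReentrantLock> lockRows(String... rows) {
        String[] sorted = rows.clone();
        Arrays.sort(sorted);
        List<ReentrantLock> held = new ArrayList<>();
        for (String row : sorted) {
            ReentrantLock l = lockFor(row);
            l.lock();
            held.add(l);
        }
        return held;
    }

    // Release in reverse acquisition order.
    void unlockAll(List<ReentrantLock> held) {
        for (int i = held.size() - 1; i >= 0; i--) {
            held.get(i).unlock();
        }
    }
}
```

The sorting is exactly what a client holding a user-exposed lock defeats: the server's canonical order means nothing if the client grabbed C first out of band.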

Unfortunately, the opposite problem is just as bad: if we do log(a), memstore(a), log(b), memstore(b), syncAll(), then edit A becomes visible to readers before it's synced to the log, and that's a no-no.
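A toy model of that ordering problem, with the fix of deferring memstore application until after the single sync. The class and method names here are invented for illustration; this is not how HBase's WAL or memstore are actually structured:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: 'walBuffer' holds appends that are not yet durable, sync()
// makes them durable, and 'memstore' is what readers see immediately.
class BatchPutModel {
    final List<String> walBuffer = new ArrayList<>();
    final List<String> walDurable = new ArrayList<>();
    final List<String> memstore = new ArrayList<>();

    void append(String edit) { walBuffer.add(edit); }
    void sync() { walDurable.addAll(walBuffer); walBuffer.clear(); }
    void apply(String edit) { memstore.add(edit); }

    // Unsafe batching: each edit is reader-visible in the memstore before
    // sync() runs, so a crash in between loses an already-visible edit.
    void unsafeBatchPut(List<String> edits) {
        for (String e : edits) { append(e); apply(e); }
        sync();
    }

    // Safe variant: one sync for the whole batch (the bulk-sync win),
    // and memstore application only after durability is established.
    void safeBatchPut(List<String> edits) {
        for (String e : edits) append(e);
        sync();
        for (String e : edits) apply(e);
    }
}
```

The safe variant is essentially what a "deferred memstore" patch would buy: one sync per batch without the visibility hole, at the cost of holding edits out of the memstore slightly longer.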

I don't have any good solutions, but here's a bad one: HBASE-2332 (remove user-exposed row locks from the region server). That JIRA could be made a little less drastic by saying that user-exposed row locks are advisory locks (e.g. like flock) which don't block IO, only other lock operations. That is to say, decouple the user-exposed locks from the internal locks needed for consistency purposes.
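A minimal sketch of that decoupling, assuming two separate lock tables; all names here are hypothetical, not the HBase or HBASE-2332 API:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: user-visible row locks are advisory (flock-style) and contend
// only with other advisory lockRow() calls; internal consistency locks
// used by writes are separate and short-lived, so a client sitting on an
// advisory lock can never stall IO or deadlock the region server.
class DecoupledRowLocks {
    private final ConcurrentHashMap<String, ReentrantLock> advisory = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, ReentrantLock> internal = new ConcurrentHashMap<>();

    // User-facing advisory lock: blocks only other advisory holders.
    void lockRowAdvisory(String row) {
        advisory.computeIfAbsent(row, r -> new ReentrantLock()).lock();
    }

    void unlockRowAdvisory(String row) {
        advisory.get(row).unlock();
    }

    // Internal lock: taken briefly around a write for consistency.
    // Note it never consults the advisory table.
    void withInternalLock(String row, Runnable update) {
        ReentrantLock l = internal.computeIfAbsent(row, r -> new ReentrantLock());
        l.lock();
        try {
            update.run();
        } finally {
            l.unlock();
        }
    }
}
```

Under this scheme the earlier deadlock scenario disappears: a put against rows A, B, C only ever takes the short internal locks, regardless of which advisory locks clients hold.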

> HBASE-2283 removed bulk sync optimization for multi-row puts
> ------------------------------------------------------------
>
>                 Key: HBASE-2353
>                 URL: https://issues.apache.org/jira/browse/HBASE-2353
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: ryan rawson
>             Fix For: 0.21.0
>
>         Attachments: HBASE-2353-deferred.txt
>
>
> Prior to HBASE-2283 we used to call flush/sync once per put(Put[]) call (ie: batch of commits).  Now we do it for every row.
> This makes bulk uploads slower if you are using the WAL.  Is there an acceptable solution that achieves both safety and performance by bulk-sync'ing puts?  Or would this not work in the face of atomicity guarantees?
> discuss!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

