hbase-user mailing list archives

From Ryan Rawson <ryano...@gmail.com>
Subject Re: HTable checkAndPut equivalent for Deletes
Date Fri, 30 Apr 2010 21:57:46 GMT

We do need a 'check and delete' but it should really be more like a
'check and mutate' where the mutation could be a delete or a put.
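To make the 'check and mutate' idea concrete, here is a minimal in-memory sketch. It is not HBase code: the class CheckAndMutateSketch, the Mutation interface, and the put()/delete() helpers are hypothetical names invented for illustration, and a plain synchronized map stands in for the region server's row-level atomicity. It also assumes, as the original question suggests, that an empty expected value is treated the same as a missing cell.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Toy model only: a synchronized map stands in for a single region's cells.
class CheckAndMutateSketch {
    /** A mutation is either a put of a new value or a delete of the cell. */
    interface Mutation {
        void apply(Map<String, byte[]> cells, String key);
    }

    static Mutation put(byte[] value) {
        return (cells, key) -> cells.put(key, value);
    }

    static Mutation delete() {
        return (cells, key) -> cells.remove(key);
    }

    private final Map<String, byte[]> cells = new HashMap<>();

    /**
     * Atomically compares the current value with the expected one and applies
     * the mutation only if they match. Returns true if the mutation was applied.
     * Assumption: a null or zero-length expected value means "cell is absent".
     */
    synchronized boolean checkAndMutate(String key, byte[] expected, Mutation mutation) {
        byte[] current = cells.get(key);
        boolean expectAbsent = expected == null || expected.length == 0;
        boolean matches = expectAbsent
                ? (current == null || current.length == 0)
                : Arrays.equals(expected, current);
        if (!matches) {
            return false;
        }
        mutation.apply(cells, key);
        return true;
    }

    synchronized byte[] get(String key) {
        return cells.get(key);
    }
}
```

With this shape, a 'check and delete' is just checkAndMutate(key, expected, delete()), and the existing check-and-put behaviour is checkAndMutate(key, expected, put(newValue)); the comparison logic is written once and shared by both.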

As for using explicit locks, the problem with them is that each lock
waiter will consume a handler thread (there are only so many of them!)
and eventually you will DoS yourself and the unlocker won't be able to
unlock the lock that is holding everyone else up!  Locks do expire,
but a 60 second pause is not ideal.  So if you expect contention, this
is not a good solution.  If you expect minimal/no contention then it
might be ok.
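The starvation scenario can be reproduced with a toy model. This is only a sketch under stated assumptions, not HBase code: a fixed-size thread pool stands in for the region server's handler threads, a CountDownLatch stands in for the held row lock, and the class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Toy model only: the pool plays the role of the server's handler threads.
class HandlerStarvationSketch {
    /**
     * Submits `waiters` requests that block on a held lock, then submits the
     * unlock request. Returns true if the unlock ran within the timeout.
     */
    static boolean unlockCompletes(int handlers, int waiters, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(handlers);
        // The latch is "locked" until the unlock request counts it down.
        CountDownLatch rowLock = new CountDownLatch(1);
        try {
            for (int i = 0; i < waiters; i++) {
                // Each waiter occupies one handler thread while it blocks.
                pool.submit(() -> {
                    rowLock.await();
                    return null;
                });
            }
            // The unlock request needs a free handler thread to run at all.
            Future<?> unlock = pool.submit(() -> rowLock.countDown());
            try {
                unlock.get(timeoutMillis, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false; // every handler is stuck waiting on the lock
            } catch (InterruptedException | ExecutionException e) {
                throw new RuntimeException(e);
            }
        } finally {
            pool.shutdownNow(); // interrupt any waiters still blocked
        }
    }
}
```

In this model, unlockCompletes(2, 1, 2000) succeeds because one handler is still free, while unlockCompletes(2, 2, 300) times out: both handlers are blocked on the lock, so the unlock request sits in the queue behind them until the lock expires or the waiters give up.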


On Fri, Apr 30, 2010 at 2:51 PM, Michael Dalton <mwdalton@gmail.com> wrote:
> Hi everyone,
> I have a quick question -- I'd like to do a simple atomic check-and-Delete
> for a row. For Put operations, HTable.checkAndPut appears to allow a simple
> atomic compare-and-update, which is great. However, there doesn't seem to be
> an equivalent function for deletes.
> I was thinking about approximating this by writing NULL or a zero-length byte
> array as the value in a Put to emulate deleting a cell. It appears that
> checkAndPut already treats a zero-length array as equivalent to a
> non-existent value when performing its comparison (before committing the
> Put). The only drawback I can see to this is that I never truly remove rows,
> I just end up with 'dead' rows containing empty byte arrays, so I'd imagine
> that every N hours or days I would need to garbage collect these empty rows
> somehow (which brings us back full circle to the issue of how to atomically
> check and delete a row).
> The only real alternative I can see for doing this would be to emulate
> checkAndDelete by using RowLocks to lock the row, perform a Get, verify that
> the row contains the expected value, then perform a delete, and then unlock
> the row itself. Correct me if I'm wrong, but this should definitely emulate
> the semantics of atomic compare-and-Delete (assuming the compare and delete
> operate on the same row and use the RowLock). However, I'm not sure what the
> performance would be for using RowLocks to emulate checkAndDelete on the
> client side vs. using Put+checkAndPut to emulate checkAndDelete on the
> server side. Does anyone have any advice on this issue, or any idea what the
> relative tradeoffs are?
> In the long run, it seems to me that the clearly optimal solution would be
> to have a checkAndDelete function in HTable, and I'd be interested in
> adding this functionality if no one else is currently working on it. Is that
> something that would be interesting to integrate and worth committing back
> to mainline? Are there any hidden pitfalls I should be aware of, or some
> technical/design reason for why this API call doesn't already exist? If not,
> I'll take a hard look at the delete and checkAndPut code in the regionserver
> and sometime soon open an issue in JIRA and start coding.
> Best regards,
> Mike
