hbase-user mailing list archives

From Michael Dalton <mwdal...@gmail.com>
Subject Re: HTable checkAndPut equivalent for Deletes
Date Fri, 30 Apr 2010 22:56:31 GMT
Thanks Ryan and Jonathan. I'll use the check-and-Put approach for now to
get this application into staging. Then I'll file a JIRA soon and start on
adding a generic checkAndMutate to handle Puts/Deletes.
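
To make the idea concrete, here is a rough sketch of the semantics a generic check-and-mutate might have. It is a toy in-memory model in plain Java, not HBase code: the class CheckAndMutateSketch, its Mutation type, and the String cell key are all made up for illustration. The one behaviour taken from this thread is that a null or zero-length expected value matches a missing cell, as checkAndPut appears to do.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory model of a table; this is NOT the HBase client or regionserver API. */
final class CheckAndMutateSketch {

    /** Hypothetical mutation type: either a put of a new value or a delete of the cell. */
    static final class Mutation {
        private final byte[] newValue; // null means "delete the cell"

        private Mutation(byte[] newValue) { this.newValue = newValue; }

        static Mutation put(byte[] value) { return new Mutation(value.clone()); }

        static Mutation delete() { return new Mutation(null); }
    }

    private final Map<String, byte[]> cells = new HashMap<>();

    /**
     * Applies the mutation only if the cell currently holds the expected value.
     * The compare and the mutation happen under one monitor, so no other
     * caller of this object can slip in between them.
     */
    synchronized boolean checkAndMutate(String cell, byte[] expected, Mutation mutation) {
        byte[] current = cells.get(cell);
        // Assumption from the thread: empty and missing are treated as equivalent.
        boolean expectMissing = expected == null || expected.length == 0;
        boolean currentMissing = current == null || current.length == 0;
        boolean matches = expectMissing ? currentMissing : Arrays.equals(current, expected);
        if (!matches) {
            return false; // check failed, nothing is changed
        }
        if (mutation.newValue == null) {
            cells.remove(cell); // the mutation is a delete
        } else {
            cells.put(cell, mutation.newValue); // the mutation is a put
        }
        return true;
    }

    /** Returns a copy of the stored value, or null if the cell does not exist. */
    synchronized byte[] get(String cell) {
        byte[] value = cells.get(cell);
        return value == null ? null : value.clone();
    }
}
```

With a shape like this, check-and-Put and check-and-Delete are the same call with a different Mutation, which is the main appeal of making it generic.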

Best regards,

Mike

On Fri, Apr 30, 2010 at 2:57 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:

> Hey,
>
> We do need a 'check and delete' but it should really be more like a
> 'check and mutate' where the mutation could be a delete or a put.
>
> As for using explicit locks, the problem is that lock waiters will
> each consume a handler thread (there are only so many of them!), and
> eventually you will DoS yourself and the unlocker won't be able to
> unlock the lock that is holding everyone else up!  Locks do expire,
> but a 60 second pause is not ideal.  So if you expect contention, this
> is not a good solution.  If you expect minimal/no contention then it
> might be ok.
>
> -ryan
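
The handler-thread problem described above can be imitated with a small simulation. This is only a sketch using java.util.concurrent, not regionserver code: the fixed thread pool stands in for the handler threads, a Semaphore stands in for the row lock, and the pool size and wait time passed to run() are arbitrary values chosen for illustration (the wait time plays the role of the lock expiry).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy model of lock waiters starving a fixed pool of handler threads. */
final class HandlerStarvationSketch {

    /** Returns how many lock waiters gave up because the unlock request was stuck behind them. */
    static int run(int handlers, long waitMillis) {
        ExecutorService handlerPool = Executors.newFixedThreadPool(handlers);
        Semaphore rowLock = new Semaphore(1);
        try {
            rowLock.acquire(); // some client already holds the row lock

            // Each waiter request blocks inside a handler thread until it gets the lock or times out.
            Callable<Boolean> waiter = () -> {
                boolean acquired = rowLock.tryAcquire(waitMillis, TimeUnit.MILLISECONDS);
                if (acquired) {
                    rowLock.release();
                }
                return acquired;
            };
            List<Future<Boolean>> waiters = new ArrayList<>();
            for (int i = 0; i < handlers; i++) {
                waiters.add(handlerPool.submit(waiter)); // occupies every handler thread
            }

            // The unlock request arrives last and has to queue: no handler is free to run it.
            Runnable unlockRequest = rowLock::release;
            Future<?> unlock = handlerPool.submit(unlockRequest);

            int timedOut = 0;
            for (Future<Boolean> w : waiters) {
                if (!w.get()) {
                    timedOut++;
                }
            }
            unlock.get();
            return timedOut;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            handlerPool.shutdown();
        }
    }
}
```

In this model the unlock can only run once some waiter has timed out and freed its thread, so at least one waiter always pays the full wait, which is the stall being warned about.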
>
> On Fri, Apr 30, 2010 at 2:51 PM, Michael Dalton <mwdalton@gmail.com> wrote:
> > Hi everyone,
> >
> > I have a quick question -- I'd like to do a simple atomic check-and-Delete
> > for a row. For Put operations, HTable.checkAndPut appears to allow a simple
> > atomic compare-and-update, which is great. However, there doesn't seem to be
> > an equivalent function for deletes.
> >
> > I was thinking about approximating this by writing a NULL or zero-length byte
> > array as the value in a Put to emulate deleting a cell. It appears that
> > checkAndPut already treats a zero-length array as equivalent to a
> > non-existent value when performing its comparison (before committing the
> > Put). The only drawback I can see to this is that I never truly remove rows,
> > I just end up with 'dead' rows containing empty byte arrays, so I'd imagine
> > that every N hours or days I would need to garbage collect these empty rows
> > somehow (which brings us back full circle to the issue of how to atomically
> > check and delete a row).
> >
> > The only real alternative I can see for doing this would be to emulate
> > checkAndDelete by using RowLocks to lock the row, perform a Get, verify that
> > the row contains the expected value, then perform a delete, and then unlock
> > the row itself. Correct me if I'm wrong, but this should definitely emulate
> > the semantics of atomic compare-and-Delete (assuming the compare and delete
> > operate on the same row and use the RowLock). However, I'm not sure what the
> > performance would be for using RowLocks to emulate checkAndDelete on the
> > client side vs. using Put+checkAndPut to emulate checkAndDelete on the
> > server side. Does anyone have any advice on this issue, or any idea what the
> > relative tradeoffs are?
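
The lock, Get, verify, delete, unlock sequence described above could look roughly like the following. This is a minimal sketch against an in-memory map with a java.util.concurrent ReentrantLock per row, not the HBase RowLock API: the class and method names are invented for illustration, and it assumes every writer to the row goes through the same lock.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Toy emulation of check-and-delete using an explicit per-row lock; NOT HBase code. */
final class LockedCheckAndDeleteSketch {
    private final Map<String, byte[]> rows = new ConcurrentHashMap<>();
    private final Map<String, ReentrantLock> rowLocks = new ConcurrentHashMap<>();

    private ReentrantLock lockFor(String row) {
        return rowLocks.computeIfAbsent(row, r -> new ReentrantLock());
    }

    /** Writers take the same row lock, otherwise the check below would not be atomic. */
    void put(String row, byte[] value) {
        ReentrantLock lock = lockFor(row);
        lock.lock();
        try {
            rows.put(row, value.clone());
        } finally {
            lock.unlock();
        }
    }

    /** Returns the stored value, or null if the row does not exist. */
    byte[] get(String row) {
        return rows.get(row);
    }

    /** Deletes the row only if it currently holds the expected value. */
    boolean checkAndDelete(String row, byte[] expected) {
        ReentrantLock lock = lockFor(row);
        lock.lock();                                   // 1. lock the row
        try {
            byte[] current = rows.get(row);            // 2. read the current value
            if (current == null || !Arrays.equals(current, expected)) {
                return false;                          // 3. verify failed, leave the row alone
            }
            rows.remove(row);                          // 4. delete
            return true;
        } finally {
            lock.unlock();                             // 5. unlock, even if the check failed
        }
    }
}
```

Against a real cluster each numbered step would be a separate client round trip, and the lock would be held across all of them.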
> >
> > In the long run, it seems to me that the clearly optimal solution would be
> > to have a checkAndDelete function in HTable, and I'd be interested in
> > adding this functionality if no one else is currently working on it. Is that
> > something that would be interesting to integrate and worth committing back
> > to mainline? Are there any hidden pitfalls I should be aware of, or some
> > technical/design reason for why this API call doesn't already exist? If not,
> > I'll take a hard look at the delete and checkAndPut code in the regionserver
> > and sometime soon open an issue in JIRA and start coding.
> >
> > Best regards,
> >
> > Mike
> >
>
