hbase-dev mailing list archives

From Suraj Varma <svarma...@gmail.com>
Subject Re: Question on Coprocessors and Atomicity
Date Sun, 04 Dec 2011 16:39:46 GMT
Jesse:
>> Quick soln - write a CP to check the single row (blocking the put).

Yeah - given that I want this done atomically, I'm wondering whether
this would even work. I believe I'd need to unlock the row so that
checkAndMutate can take the lock, so there is a brief window in
between where no lock is held and some other thread could take it.
One option would be to pass a lock in to checkAndMutate ... but that
would lengthen the locking period and may have performance
implications, I think.
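
A minimal plain-Java sketch of the window described above. It does not use
HBase classes: RowCell, checkThenPut and the betweenLocks hook are
hypothetical names, and a ReentrantLock merely stands in for a row lock. The
hook runs on the calling thread to make the interleaving deterministic.

```java
import java.util.concurrent.locks.ReentrantLock;

// Plain-Java stand-in for a single row; this is NOT HBase code.
class RowCell {
    // Hypothetical stand-in for a per-row lock.
    private final ReentrantLock rowLock = new ReentrantLock();
    private long value;

    RowCell(long initial) {
        this.value = initial;
    }

    long read() {
        rowLock.lock();
        try {
            return value;
        } finally {
            rowLock.unlock();
        }
    }

    // Racy variant: the check and the put each take the lock separately.
    // betweenLocks simulates another writer running in the unlocked window.
    boolean checkThenPut(long expected, long newValue, Runnable betweenLocks) {
        boolean matched;
        rowLock.lock();
        try {
            matched = (value == expected);
        } finally {
            rowLock.unlock();
        }
        betweenLocks.run(); // no lock is held here
        if (!matched) {
            return false;
        }
        rowLock.lock();
        try {
            value = newValue; // may overwrite a value the check never saw
            return true;
        } finally {
            rowLock.unlock();
        }
    }

    // Atomic variant: the check and the put share one lock hold.
    boolean checkAndPut(long expected, long newValue) {
        rowLock.lock();
        try {
            if (value != expected) {
                return false;
            }
            value = newValue;
            return true;
        } finally {
            rowLock.unlock();
        }
    }

    public static void main(String[] args) {
        RowCell racy = new RowCell(10);
        // The interleaved writer changes 10 to 3 after the check has passed.
        boolean applied = racy.checkThenPut(10, 7, () -> racy.checkAndPut(10, 3));
        System.out.println("racy applied=" + applied + " value=" + racy.read());

        RowCell safe = new RowCell(10);
        safe.checkAndPut(10, 3);
        boolean second = safe.checkAndPut(10, 7);
        System.out.println("atomic applied=" + second + " value=" + safe.read());
    }
}
```

In the racy variant the second writer's put still lands even though the value
it checked against has already changed; in the atomic variant it is rejected.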

I see a lot of potential in the Constraints implementation - it would
really open up CAS operations to do functional constraint checking,
rather than just value comparisons.
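
To make the idea concrete, here is a rough plain-Java sketch of a
check-and-mutate that takes functional constraints and evaluates them inside
the lock. RowConstraint, ConstrainedCell, equalTo and differenceAbove are
hypothetical names for illustration only; this is not the HBASE-4605
Constraint API nor the real checkAndMutate signature. The value comparison
becomes just one constraint among others.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical functional constraint: may the current value be replaced by newValue?
interface RowConstraint {
    boolean allows(long currentValue, long newValue);
}

// Plain-Java stand-in for a single cell; this is NOT HBase code.
class ConstrainedCell {
    private final ReentrantLock rowLock = new ReentrantLock();
    private long value;

    ConstrainedCell(long initial) {
        this.value = initial;
    }

    // All constraints are evaluated and the write is applied under one lock hold.
    boolean checkAndMutate(long newValue, RowConstraint... constraints) {
        rowLock.lock();
        try {
            for (RowConstraint c : constraints) {
                if (!c.allows(value, newValue)) {
                    return false; // rejected, value left untouched
                }
            }
            value = newValue;
            return true;
        } finally {
            rowLock.unlock();
        }
    }

    long read() {
        rowLock.lock();
        try {
            return value;
        } finally {
            rowLock.unlock();
        }
    }

    // Classic compare-and-set expressed as a constraint.
    static RowConstraint equalTo(long expected) {
        return (current, next) -> current == expected;
    }

    // The pseudo-sql condition: (column-value - new-value) > threshold-value.
    static RowConstraint differenceAbove(long threshold) {
        return (current, next) -> (current - next) > threshold;
    }

    public static void main(String[] args) {
        ConstrainedCell cell = new ConstrainedCell(100);
        System.out.println(cell.checkAndMutate(90, differenceAbove(5)));
        System.out.println(cell.checkAndMutate(88, differenceAbove(5)));
        // Several constraints at once, e.g. an equality check plus a lower bound.
        RowConstraint nonNegative = (current, next) -> next >= 0;
        System.out.println(cell.checkAndMutate(50, equalTo(90), nonNegative));
        System.out.println(cell.read());
    }
}
```

Because every constraint runs under the same lock hold as the write, no
second lock acquisition is needed between the check and the put.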

--Suraj

On Sun, Dec 4, 2011 at 8:32 AM, Suraj Varma <svarma.ng@gmail.com> wrote:
> Thanks - I see that the lock is taken internal to checkAndMutate.
>
> I'm wondering whether it is a better idea to actually pass in a
> Constraint (or even Constraints) as the checkAndMutate argument. Right
> now it takes a Comparator and a CompareOp for verification.
> But this could just be a special case of a Constraint which is
> evaluated within the lock.
>
> In other words, we could open up a richer Constraint checking api
> where any "functional" Constraint check can be performed in the
> checkAndPut operation.
>
> This would also not have the same performance impact as taking a
> rowLock in preCheckAndPut and releasing it in postCheckAndPut. And - it
> is really (in my mind) implementing the compare-and-set more generically.
>
> I also see the potential of passing in multiple constraints (say
> upper/lower bounds in Increment/Decrement operations) etc.
>
> --Suraj
>
>
> On Sat, Dec 3, 2011 at 7:44 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>> From HRegionServer.checkAndPut():
>>    if (region.getCoprocessorHost() != null) {
>>      Boolean result = region.getCoprocessorHost()
>>        .preCheckAndPut(row, family, qualifier, CompareOp.EQUAL, comparator,
>>          put);
>> ...
>>    boolean result = checkAndMutate(regionName, row, family, qualifier,
>>      CompareOp.EQUAL, new BinaryComparator(value), put,
>>      lock);
>> We can see that the lock isn't taken for preCheckAndPut().
>>
>> To satisfy Suraj's requirement, I think a slight change to checkAndPut() is
>> needed so that atomicity can be achieved across preCheckAndPut() and
>> checkAndMutate().
>>
>> Cheers
>>
>> On Sat, Dec 3, 2011 at 4:54 PM, Suraj Varma <svarma.ng@gmail.com> wrote:
>>
>>> Just so my question is clear ... everything I'm suggesting is in the
>>> context of a single row (not cross row / table). - so, yes, I'm
>>> guessing obtaining a RowLock on the region side during preCheckAndPut
>>> / postCheckAndPut would certainly work. Which was why I was asking
>>> whether the pre/postCheckAndPut obtains the row lock or whether the
>>> row lock is only obtained within checkAndPut.
>>>
>>> Let's say the coprocessor takes a rowlock in preCheckAndPut ... will
>>> that even work? i.e. can the same rowlock be inherited by the
>>> checkAndPut api within that thread's context? Or will preCheckAndPut
>>> have to release the lock so that checkAndPut can take it (which won't
>>> work for my case, as it has to be atomic between the preCheck and
>>> Put.)
>>>
>>> Thanks for pointing me to the Constraints functionality - I'll take a
>>> look at whether it could potentially work.
>>> --Suraj
>>>
>>> On Sat, Dec 3, 2011 at 10:25 AM, Jesse Yates <jesse.k.yates@gmail.com> wrote:
>>> > I think the feature you are looking for is a Constraint. Currently they
>>> > are being added to 0.94 in
>>> > HBASE-4605<https://issues.apache.org/jira/browse/HBASE-4605>;
>>> > they are almost ready to be rolled in, and backporting to 0.92 is
>>> > definitely doable.
>>> >
>>> > However, Constraints aren't going to be quite flexible enough to
>>> > efficiently support what you are describing. For instance, with a
>>> > constraint, you are ideally just checking the put value against some
>>> > simple constraint (never over 10 or always an integer), but looking at
>>> > the current state of the table before allowing the put would currently
>>> > require creating a full blown connection to the local table through
>>> > another HTable.
>>> >
>>> > In the short term, you could write a simple coprocessor to do this
>>> > checking and then move over to constraints (which are a simpler, more
>>> > flexible way of doing this) when the necessary features have been added.
>>> >
>>> > It is worth discussing whether it makes sense to have access to the
>>> > local region through a constraint. Though that breaks the idea a little
>>> > bit, it would certainly be useful and not overly wasteful in terms of
>>> > runtime.
>>> >
>>> > Supposing the feature were added to talk to the local table, and since
>>> > the puts are going to be serialized on the regionserver (at least to that
>>> > single row you are trying to update), you will never get a situation
>>> > where the value added is over the threshold. If you were really worried
>>> > about the atomicity of the operation, then when doing the put, first get
>>> > the RowLock, then do the put and release the RowLock. However, that
>>> > latter method is going to be really slow, so it should only be used as a
>>> > stop gap if the constraint doesn't work as expected, until a patch is
>>> > made for constraints.
>>> >
>>> > Feel free to open up a ticket and link it to 4605 for adding the local
>>> > table access functionality, and we can discuss the de/merits of adding
>>> > the access.
>>> >
>>> > -Jesse
>>> >
>>> > On Sat, Dec 3, 2011 at 6:24 AM, Suraj Varma <svarma.ng@gmail.com> wrote:
>>> >
>>> >> I'm looking at the preCheckAndPut / postCheckAndPut api with
>>> >> coprocessors and I'm wondering ... are these pre/post checks done
>>> >> _after_ taking the row lock, or is the row lock only taken within the
>>> >> checkAndPut api?
>>> >>
>>> >> I'm interested in seeing if we can implement something like:
>>> >> (in pseudo sql)
>>> >> update table-name
>>> >> set column-name = new-value
>>> >> where (column-value - new-value) > threshold-value
>>> >>
>>> >> Basically ... I want to enhance the checkAndPut to not just compare
>>> >> "values" ... but apply an arbitrary function on the value _atomically_
>>> >> in the Put call. Multiple threads would be firing these mutations and
>>> >> I'd like the threshold-value above to never be breached under any
>>> >> circumstance.
>>> >>
>>> >> Is there a solution that can be implemented either via checkAndPut or
>>> >> using coprocessors preCheckAndPut? If not, would this be a useful
>>> >> feature to build in HBase?
>>> >>
>>> >> Thanks,
>>> >> --Suraj
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > -------------------
>>> > Jesse Yates
>>> > 240-888-2200
>>> > @jesse_yates
>>>
