hbase-user mailing list archives

From Clint Morgan <clint.mor...@troove.net>
Subject Re: Secondary indexes and transactions
Date Tue, 19 Jan 2010 17:32:22 GMT
Sorry I have been so slow in understanding. I now see what you mean. I was
trying to explain how I thought it *should* work, rather than what it
actually does now.

That method of aborting on an exception in the 2nd phase is incorrect for
the reason you mention: for a 3-region transaction, we could have committed
the first region, hit an error on the 2nd region, and then aborted the 3rd
region. So even disregarding indexes, we would lose atomicity in
the base table.

Rather, we can let just the 2nd region fail, on the assumption that it has
all the information it needs to get that transaction committed when it
recovers from the WAL. So when the 2nd region is finally recovered and ready
to serve again, it will have the transaction committed.
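
A minimal sketch of that second phase, using made-up names (Participant,
commitAll and participant are hypothetical stand-ins, not the actual
TransactionManager code): once the commit decision is made, a failure on one
region is only recorded, and the other regions are still told to commit.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; these types are NOT the HBase transactional API.
class CommitPhaseSketch {

    /** Stand-in for a region taking part in a transaction. */
    interface Participant {
        String name();
        void commit(long txId) throws IOException;
    }

    /**
     * Second phase of 2PC once the decision to commit has been made.
     * A failure on one participant does not abort the others: the decision
     * is final, so the failed participant is only reported and is expected
     * to finish the commit itself when it recovers from its WAL.
     */
    static List<String> commitAll(long txId, List<Participant> participants) {
        List<String> pendingRecovery = new ArrayList<>();
        for (Participant p : participants) {
            try {
                p.commit(txId);
            } catch (IOException e) {
                // No abort() here: earlier regions may already be committed.
                pendingRecovery.add(p.name());
            }
        }
        return pendingRecovery;
    }

    /** Builds an in-memory participant that records its commit or always fails. */
    static Participant participant(String name, boolean fails, List<String> committed) {
        return new Participant() {
            public String name() { return name; }
            public void commit(long txId) throws IOException {
                if (fails) throw new IOException(name + " failed during commit");
                committed.add(name);
            }
        };
    }

    public static void main(String[] args) {
        List<String> committed = new ArrayList<>();
        List<Participant> regions = List.of(
            participant("region1", false, committed),
            participant("region2", true, committed),
            participant("region3", false, committed));
        List<String> pending = commitAll(42L, regions);
        System.out.println("committed=" + committed + " pendingRecovery=" + pending);
    }
}
```

The sketch only covers the coordinator side; it assumes the failed region
really can complete the commit from its own log later.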

For your second point about not aborting in the case of failure in the
regionserver, you also raise a valid point. A failure of the filesystem will
cause an abort, and then initiate the WAL recovery properly. However, other
exceptions could sneak through (maybe an OOME on the indexed put RPC) and
leave the index inconsistent and/or some of the trx puts unapplied.

Rather, we should probably be more explicit about handling IOEs in the
transactional layer. The trx region server needs to guarantee that when it
is told to commit a transaction, the writes will eventually occur. It may be
as simple as handling an exception in the commit methods by aborting the
region server, but this seems too fragile.
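
A rough sketch of the "abort on exception in commit" idea, with hypothetical
names throughout (Store, logPuts, commit and recover are illustrative, not
HBase classes or methods). It assumes the puts are idempotent, so replaying a
partially applied transaction from the WAL is safe, and it models the WAL and
the global transaction log as in-memory collections.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from HBase itself.
class FailStopCommitSketch {

    /** Stand-in for the store that applies puts and may fail. */
    interface Store {
        void apply(String put) throws IOException;
    }

    private final Map<Long, List<String>> wal = new HashMap<>(); // txId -> logged puts
    private boolean aborted = false;

    /** Phase 1: the puts are written to the (in-memory) WAL before any commit. */
    void logPuts(long txId, List<String> puts) {
        wal.put(txId, new ArrayList<>(puts));
    }

    /** Phase 2: apply the logged puts; any IOException stops the server. */
    void commit(long txId, Store store) {
        try {
            for (String put : wal.getOrDefault(txId, List.of())) {
                store.apply(put);
            }
            wal.remove(txId);
        } catch (IOException e) {
            // Fail-stop: keep the WAL entry so that recovery can finish the commit.
            aborted = true;
        }
    }

    boolean isAborted() { return aborted; }

    /** Recovery: replay every logged transaction the global log marks as committed. */
    void recover(Set<Long> globallyCommitted, Store store) throws IOException {
        for (Long txId : new ArrayList<>(wal.keySet())) {
            if (globallyCommitted.contains(txId)) {
                for (String put : wal.get(txId)) {
                    store.apply(put); // assumed idempotent, so re-applying is harmless
                }
                wal.remove(txId);
            }
        }
        aborted = false;
    }

    public static void main(String[] args) throws IOException {
        Set<String> table = new LinkedHashSet<>();
        FailStopCommitSketch server = new FailStopCommitSketch();
        server.logPuts(7L, List.of("a", "b"));
        // The first attempt fails part-way through, which aborts the server.
        server.commit(7L, put -> {
            if (put.equals("b")) throw new IOException("disk full");
            table.add(put);
        });
        System.out.println("aborted=" + server.isAborted() + " table=" + table);
        // Recovery replays the transaction because the global log says "committed".
        server.recover(Set.of(7L), table::add);
        System.out.println("aborted=" + server.isAborted() + " table=" + table);
    }
}
```

This only catches IOException; errors such as an OOME would still get past
the handler, which is part of why the approach looks fragile.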

I've been putting off worrying too much about the details of transactional
failure recovery until we have append and a working write-ahead-log in core
hbase. But it's probably about time to revisit...

Thank you very much for digging in here; a second set of eyes is handy.
-clint

On Tue, Jan 19, 2010 at 1:37 AM, Mridul Muralidharan
<mridulm@yahoo-inc.com> wrote:

> Clint Morgan wrote:
>
>> After the 2PC process has determined that a commit should happen there is
>> no
>> roll-back. The commit must be processed.
>>
>
>
> From org.apache.hadoop.hbase.client.transactional.TransactionManager
>
> doCommit() which is the 2nd phase of 2-phase commit, on throwing Exception
> results in abort() which does the rollback.
> And this abort specifically ignores the region which hit the error -
> thereby making the index go out of sync.
>
>
> I hope I am not missing something with this assertion, since I had
> mentioned this earlier too (possibly got buried in my details ?).
>
>
> Since abort is resulting in an rpc call, which results in some log
> manipulation, I left it at that and did not dig deeper - do you mean it
> actually does nothing ?
>
>
>
>
>> So in your example, a commit has been approved, and one of the regions
>> is told to go ahead and commit. The region triggers the index Put, but
>> then
>> fails on his Puts (like out of disk space, out of memory, etc). This
>> should
>> shutdown the RegionServer. Then when the region's WAL is recovered from,
>> the
>> trx puts from the partially-committed transaction will be there. We will
>> look in the global transaction log to see that the trx is to be committed,
>> and then apply the puts to the base table.
>>
>
>
> I relooked at the implementation just to make sure I got the basic issue
> right.
> I did not see this behavior you mention above - of IOException resulting in
> shutting down of a region server - and quite a lot of methods actually could
> result in IOException's getting thrown when traversing the call-graph from
> indexedregion.Put's invocation (filesystem going missing is just one case
> where this happens I think - but I did not see this as being the only case :
> at least impl/doc wise).
>
>
>
>
> Anyway, to make progress, if commit failure in an indexed regionserver does
> a rollback of the txn, then the issue I mentioned can occur ?
>
>
> Thanks for your patience and time !
>
> Regards,
> Mridul
>
>
>
>> -clint
>>
>> On Fri, Jan 15, 2010 at 2:43 AM, Mridul Muralidharan
>> <mridulm@yahoo-inc.com> wrote:
>>
>>  I think I might not have explained it well enough.
>>> As part of executing a Put, the index update happens prior to updating
>>> the
>>> underlying transactional table currently - and is done outside of the
>>> locks.
>>> If the underlying transactional table update results in an exception -
>>> what
>>> is the state of the index ? From what I understand, a rollback is
>>> initiated
>>> - and this results in rolling back all regions - except for the one which
>>> threw the exception : and so the secondary index update which happened
>>> implicitly is never reverted.
>>> Or am I missing something here ?
>>>
>>> To be clear, I am talking about the actual commit as part of the two
>>> phase
>>> commit throwing an exception : not a conflict exception, but an
>>> IOException
>>> or variant - which can result in the secondary index going out of sync.
>>> I am contrasting it with the case of explicit indexes maintained by
>>> client
>>> - where the rollback by client (when the commit fails for a region)
>>> results
>>> in rollback on all the regions in the transaction - which includes the
>>> secondary indexes 'visible' to the client.
>>>
>>>
>>>
>>>
>>>
>>> Thanks,
>>> Mridul
>>>
>>>
>>>
>>>
>>>
>>>  If the regionserver crashes during this commit process, then I *think*
>>>> it
>>>> should still recover correctly. It will see the transactional operations
>>>> in
>>>> the WAL, and then propagate the puts into the index. However this WAL
>>>> recovery stuff has been changing, and I'm not confident that it
>>>> currently
>>>> works in all failure cases.
>>>>
>>>> Does this normal case address your concerns?
>>>>
>>>> -clint
>>>>
>>>> On Sun, Jan 3, 2010 at 4:46 PM, Mridul Muralidharan
>>>> <mridulm@yahoo-inc.com> wrote:
>>>>
>>>>  stack wrote:
>>>>
>>>>>  On Sun, Jan 3, 2010 at 10:46 AM, Mridul Muralidharan
>>>>>
>>>>>> <mridulm@yahoo-inc.com> wrote:
>>>>>>
>>>>>>  I was wondering about the atomicity guarantees when using secondary
>>>>>>
>>>>>>  indexes from within a transaction.
>>>>>>>
>>>>>>>
>>>>>>>  You are talking about indexed hbase from transactional hbase
>>>>>>> contrib?
>>>>>>>
>>>>>>>  Yes, exactly.
>>>>>
>>>>>
>>>>>
>>>>>  From what I could gather, updates to the index table go through their
>>>>>
>>>>>> own
>>>>>>
>>>>>>  (set of) rpc before the underlying transactional table is updated -
>>>>>>> and
>>>>>>> these updates happen outside of the locks for the transaction table.
>>>>>>>
>>>>>>>
>>>>>>>  Yes.  But IIUC, the client is running a transaction that spans the
>>>>>>>
>>>>>> update
>>>>>> to
>>>>>> the two tables.  It'll take care of the undo should say the update to
>>>>>> the
>>>>>> transaction table fails.
>>>>>>
>>>>>>
>>>>>>  Isn't the update to the secondary index implicitly done ? As in, does
>>>>>>
>>>>> the
>>>>> client 'see' this update ?
>>>>> My impression was that the secondary index update was done by the
>>>>> indexedregion - and was not visible to the client : which manages occ
>>>>> transaction ...
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>  Also, the index regions need not colocate with the table region.
>>>>>
>>>>>> So essentially wondering
>>>>>>> a) if the index can go out of sync with the transactional table ?
>>>>>>>
>>>>>>>
>>>>>>>  It should not.  The client should run the undos if the insert does
>>>>>>> not
>>>>>>>
>>>>>> go
>>>>>> into both tables successfully.
>>>>>>
>>>>>>
>>>>>>
>>>>>>  b) if there are errors with update to table, are the indexes rolled
>>>>>> back
>>>>>>
>>>>>>  ?
>>>>>>>
>>>>>>>
>>>>>>>  Yes.
>>>>>>>
>>>>>>
>>>>>>
>>>>>>  c) Whether there can be issues if there are parallel updates invoked
>>>>>> for
>>>>>>
>>>>>>  the same row - whether index changes end up being inconsistent with
>>>>>>> table
>>>>>>> data (due to lock not being held while updating index).
>>>>>>>
>>>>>>>
>>>>>>>  This might be possible.  There is a lock held on a row.  I'm not
>>>>>>> sure
>>>>>>>
>>>>>> if
>>>>>> the
>>>>>> lock is held on transaction table row while the update is being done
>>>>>> to
>>>>>> the
>>>>>> index table.
>>>>>>
>>>>>> This is the doc. as it stands on transactional hbase:
>>>>>>
>>>>>>
>>>>>>
>>>>>> http://hadoop.apache.org/hbase/docs/r0.20.2/api/org/apache/hadoop/hbase/client/transactional/package-summary.html#package_description
>>>>>>
>>>>>> Here is the doc. on indexed-transactional hbase:
>>>>>>
>>>>>>
>>>>>>
>>>>>> http://hadoop.apache.org/hbase/docs/r0.20.2/api/org/apache/hadoop/hbase/client/tableindexed/package-summary.html#package_description
>>>>>>
>>>>>> You've probably tripped over it already but just in case, it might
>>>>>> help.
>>>>>>
>>>>>>
>>>>>>  I did go through the package summaries, thanks : which is what
>>>>> increased
>>>>> my
>>>>> confusion.
>>>>>
>>>>> My current understanding is :
>>>>>
>>>>> a) Client 'simulates' the transaction - by inspecting the state of the
>>>>> rows
>>>>> on commit and rolls back in case of conflicting updates.
>>>>>
>>>>> b) secondary index updates are transparent to client api and are
>>>>> directly
>>>>> done by the indexedregion as part of its implementation.
>>>>>
>>>>>
>>>>> If this is correct, I am wondering if overlapping rollbacks can result
>>>>> in
>>>>> secondary index going out of sync with the table since (a) does not see
>>>>> those (one update gets rolled back while another goes through - or
>>>>> variations of it).
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Mridul
>>>>>
>>>>>
>>>>>
>>>>>  St.Ack
>>>>>
>>>>>
>>>>>>  I guess they are all kind of related queries.
>>>>>>
>>>>>>>
>>>>>>> I was not able to get a clear picture from the archives, so
>>>>>>> RTFM/pointers
>>>>>>> would be helpful if this is already answered.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Mridul
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>
