hbase-issues mailing list archives

From "ryan rawson (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2294) Enumerate ACID properties of HBase in a well defined spec
Date Tue, 16 Mar 2010 00:48:27 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845630#action_12845630
] 

ryan rawson commented on HBASE-2294:
------------------------------------

I agree on the implementation details.  I was just illustrating how the code currently
works.  It helps to have a concrete example of how things are done when writing specs.  On
the plus side, I think what I described above both makes for a good user experience and is
possible to implement (as evidenced by the working implementation over in HBASE-2248).

I would also like to keep the term 'row lock' out of the spec - I think we could possibly
have serialized atomic updates to HBase without row locks (wow!).

One point of discussion: I think it's important to have a scanner stay 'up to date' as much
as possible.  Not only would that simplify the implementation (as it stands today), but it
also makes no sense to provide a level of transaction isolation for scanners without a
broader transaction promise.  If you are doing an aggregate scan on a table via MapReduce,
we already provide a mechanism for giving yourself a consistent view of the world, and that
is the Scan#setTimeRange() call.  Supporting such a consistent view natively in a Scan would
require carrying the consistency view information from region to region, and without some
serious changes we could not support that.  Given our existing support, I would argue it is
unnecessary to do further work to promise large-scale scanner consistency.

Scanner consistency is already an issue in the master META scanner.  We have to double-check
the results of a scan to avoid problematic things such as double assignment.  Keeping the
scanner more up to date will help with this.

One area where users could have issues is consuming and producing rows in the same job.
The MapReduce framework helps with this: with TIF (TableInputFormat) you read in one phase,
and with TOF (TableOutputFormat) you write in another, and the two phases are by necessity
non-overlapping.

The more I think about it, the more I realize that if a user wants perfect isolation, they
should use Scan#setTimeRange() - it supports everything you want: restartable scanners,
simple semantics, and cross-region support, and it has an existing implementation.
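
To make the time-range argument concrete, here is a minimal sketch of why a fixed upper
timestamp gives a consistent view across regions.  It is a toy in-memory model, not the
HBase client API: the names TimeRangeScanSketch, Region, scan and scanBothRegions are
hypothetical, and it assumes writers never use timestamps older than the chosen bound and
ignores deletes.  The range is half-open, [minStamp, maxStamp), as with Scan#setTimeRange().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model only: hypothetical names, no HBase classes involved.
final class TimeRangeScanSketch {

    /** A stand-in for a region: row key -> (timestamp -> value), both sorted. */
    static final class Region {
        private final TreeMap<String, TreeMap<Long, String>> rows = new TreeMap<>();

        void put(String row, long timestamp, String value) {
            rows.computeIfAbsent(row, k -> new TreeMap<>()).put(timestamp, value);
        }

        /** Newest version of each row with minStamp <= timestamp < maxStamp. */
        List<String> scan(long minStamp, long maxStamp) {
            List<String> out = new ArrayList<>();
            for (Map.Entry<String, TreeMap<Long, String>> e : rows.entrySet()) {
                NavigableMap<Long, String> inRange =
                        e.getValue().subMap(minStamp, true, maxStamp, false);
                if (!inRange.isEmpty()) {
                    out.add(e.getKey() + "=" + inRange.lastEntry().getValue());
                }
            }
            return out;
        }
    }

    /** Scans two regions in order while writes land between the two scans. */
    static List<String> scanBothRegions(long maxStamp) {
        Region first = new Region();
        Region second = new Region();
        first.put("a", 10L, "a-v1");
        second.put("m", 20L, "m-v1");

        // The bound is fixed once by the caller, so the scan of each region
        // (or a restarted scan) needs no extra state carried between regions.
        List<String> seen = new ArrayList<>(first.scan(0L, maxStamp));

        // Writes that arrive after the scan has left the first region.
        first.put("a", 150L, "a-v2");
        second.put("m", 150L, "m-v2");
        second.put("z", 160L, "z-v1");

        seen.addAll(second.scan(0L, maxStamp));
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(scanBothRegions(100L));           // bounded view
        System.out.println(scanBothRegions(Long.MAX_VALUE)); // unbounded view
    }
}
```

With the bound at 100 the scan returns a-v1 and m-v1, a view as of one point in time.  With
no effective bound it returns a-v1 from the first region but m-v2 and z-v1 from the second,
which is the mixed view an 'up to date' scanner can produce.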



> Enumerate ACID properties of HBase in a well defined spec
> ---------------------------------------------------------
>
>                 Key: HBASE-2294
>                 URL: https://issues.apache.org/jira/browse/HBASE-2294
>             Project: Hadoop HBase
>          Issue Type: Task
>          Components: documentation
>            Reporter: Todd Lipcon
>            Priority: Blocker
>             Fix For: 0.20.4, 0.21.0
>
>
> It's not written down anywhere what the guarantees are for each operation in HBase with
> regard to the various ACID properties. I think the developers know the answers to these
> questions, but we need a clear spec for people building systems on top of HBase. Here are
> a few sample questions we should endeavor to answer:
> - For a multicell put within a CF, is the update made durable atomically?
> - For a put across CFs, is the update made durable atomically?
> - Can a read see a row that hasn't been sync()ed to the HLog?
> - What isolation do scanners have? Somewhere between snapshot isolation and no isolation?
> - After a client receives a "success" for a write operation, is that operation guaranteed
>   to be visible to all other clients?
> etc
> I see this JIRA as having several points of discussion:
> - Evaluation of what the current state of affairs is
> - Evaluate whether we currently provide any guarantees that aren't useful to users of
>   the system (perhaps we can drop in exchange for performance)
> - Evaluate whether we are missing any guarantees that would be useful to users of the
>   system

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

