hbase-issues mailing list archives

From "Lars Hofhansl (Issue Comment Edited) (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Edited] (HBASE-5229) Provide basic building blocks for "multi-row" local transactions.
Date Thu, 09 Feb 2012 21:19:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200911#comment-13200911
] 

Lars Hofhansl edited comment on HBASE-5229 at 2/9/12 9:17 PM:
--------------------------------------------------------------


bq.  On 2012-02-05 07:26:08, Jesse Yates wrote:
bq.  > A couple of nits and small implementation details, but overall looks pretty good.

You're looking at an old version of the patch. :)


bq.  On 2012-02-05 07:26:08, Jesse Yates wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java,
line 3160
bq.  > <https://reviews.apache.org/r/3748/diff/1/?file=72045#file72045line3160>
bq.  >
bq.  >     But in the comments on the MultiRowMutation you push that checking off onto
the RS, so no checking really happens then (except, I guess when you try to mutate rows on
the region and it fails b/c those rows aren't there, but that seems kinda late for the check).

Checking is happening in region.internalMutate.
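
For illustration only, a region-side row check of this kind could look roughly like the sketch below. This is not the code in the patch: the class and method names (RegionRangeCheck, rowInRange, checkRows) are made up for the example, and the only assumptions are that a region covers [startKey, endKey), that an empty end key means "no upper bound", and that keys compare as unsigned bytes.

```java
import java.nio.charset.StandardCharsets;

public class RegionRangeCheck {
    // Lexicographic comparison treating each byte as unsigned.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // True if startKey <= row < endKey; an empty endKey is treated as unbounded.
    static boolean rowInRange(byte[] startKey, byte[] endKey, byte[] row) {
        return compare(startKey, row) <= 0
            && (endKey.length == 0 || compare(row, endKey) < 0);
    }

    // Hypothetical helper: reject the whole batch before any row is touched.
    static void checkRows(byte[] startKey, byte[] endKey, Iterable<byte[]> rows) {
        for (byte[] row : rows) {
            if (!rowInRange(startKey, endKey, row)) {
                throw new IllegalArgumentException(
                    "row outside region: " + new String(row, StandardCharsets.UTF_8));
            }
        }
    }
}
```

Running the check over all rows before applying anything means a batch that spans regions fails as a whole rather than halfway through.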


bq.  On 2012-02-05 07:26:08, Jesse Yates wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java,
line 786
bq.  > <https://reviews.apache.org/r/3748/diff/1/?file=72037#file72037line786>
bq.  >
bq.  >     I think this is unnecessary; javadoc should handle inheriting the docs.

It's done elsewhere, it is good to call out that no doc was added here, because the interface
has the doc.


bq.  On 2012-02-05 07:26:08, Jesse Yates wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java,
line 284
bq.  > <https://reviews.apache.org/r/3748/diff/1/?file=72038#file72038line284>
bq.  >
bq.  >     or presplitting as is described in other documentation.

Yes, should add this.


bq.  On 2012-02-05 07:26:08, Jesse Yates wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MultiRowMutation.java,
line 35
bq.  > <https://reviews.apache.org/r/3748/diff/1/?file=72039#file72039line35>
bq.  >
bq.  >     Probably want to wrap NOTE in <b> tags to call it out.

Sure.


bq.  On 2012-02-05 07:26:08, Jesse Yates wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MultiRowMutation.java,
line 45
bq.  > <https://reviews.apache.org/r/3748/diff/1/?file=72039#file72039line45>
bq.  >
bq.  >     A javadoc here might be nice to indicate that the nullary constructor is actually
completely ok to use (as opposed to the more common state of being reserved for readFields).

Good point. Although unless it is called out that you cannot use a constructor there should
be no reason why you couldn't.


bq.  On 2012-02-05 07:26:08, Jesse Yates wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MultiRowMutation.java,
line 64
bq.  > <https://reviews.apache.org/r/3748/diff/1/?file=72039#file72039line64>
bq.  >
bq.  >     Just because it uses protected structures doesn't mean that it's necessarily thread
safe. In fact, because it is using the standard ArrayList, there is no guarantee of safety.
Either the class should be marked as not thread safe OR the mutations should be wrapped as
a concurrent list.

I disagree.
This is a client side object and none of the client side objects are threadsafe nor should
they be (see Put.java/Delete.java/Increment.java/Append.java/etc), that's the task of the client
application.

I misread Ted's comment before, of course this method is not threadsafe.
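
To make the "task of the client application" point concrete, here is a minimal sketch of an application serializing access itself. Batch is a hypothetical stand-in for a client-side container such as MultiRowMutation (ArrayList-backed and not thread safe); it is not the real class, and the guard object and method names are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class ClientSideBatch {
    // Hypothetical stand-in for a client-side container; deliberately not thread safe.
    static final class Batch {
        private final List<String> mutations = new ArrayList<>();

        void add(String mutation) {
            mutations.add(mutation);
        }

        int size() {
            return mutations.size();
        }
    }

    // The application, not the container, guards every access with one lock.
    static int fillConcurrently(int threads, int perThread) {
        final Batch batch = new Batch();
        final Object guard = new Object();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    synchronized (guard) {
                        batch.add("row-" + id + "-" + i);
                    }
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        synchronized (guard) {
            return batch.size();
        }
    }
}
```

The container stays cheap for the common single-threaded case, and only applications that actually share it across threads pay for the locking.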


bq.  On 2012-02-05 07:26:08, Jesse Yates wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/RowMutation.java,
line 95
bq.  > <https://reviews.apache.org/r/3748/diff/1/?file=72040#file72040line95>
bq.  >
bq.  >     You really don't need to keep around the row anymore either because you can
get that from the mutations as you already do mutateRows with MultiRowMutation. It's nice to
store it, but is only going to be checked infrequently and saves you a little bit over the
wire (which could add up, depending on row size).

Same as Put and Delete (where every KV already has the row).
There is room for optimization, but this is not the jira to do that.


bq.  On 2012-02-05 07:26:08, Jesse Yates wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java,
line 4161
bq.  > <https://reviews.apache.org/r/3748/diff/1/?file=72044#file72044line4161>
bq.  >
bq.  >     Surprised this isn't a utility method in HRegion - it seems really useful. Maybe
worth pulling out for general use.

internalMutate?


bq.  On 2012-02-05 07:26:08, Jesse Yates wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java,
line 4181
bq.  > <https://reviews.apache.org/r/3748/diff/1/?file=72044#file72044line4181>
bq.  >
bq.  >     This isn't actually true, right? With multirow, you are actually going to lock
more than one row (and the lockId null seems kind of a hack around that as it is always null,
so far).

lockId could be passed to use one lock to lock all rows. Not used, yet, but still useful.
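
As a rough, self-contained sketch of the general idea of taking several row locks and applying a batch under them (this is not the HRegion code, and every name in it is hypothetical): locks are acquired in sorted row order so that two overlapping batches cannot deadlock, and all of them are released in a finally block. In this sketch reads take no lock, so the batch is atomic only with respect to other writers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

public class MultiRowLockSketch {
    // One lock per row, created on first use (hypothetical in-memory stand-in).
    private final Map<String, ReentrantLock> rowLocks = new ConcurrentHashMap<>();
    private final Map<String, String> store = new ConcurrentHashMap<>();

    // Applies all updates while holding the lock of every affected row.
    public void mutateRows(Map<String, String> updates) {
        // Sorted acquisition order avoids deadlock between overlapping batches.
        SortedSet<String> rowsToLock = new TreeSet<>(updates.keySet());
        List<ReentrantLock> held = new ArrayList<>();
        try {
            for (String row : rowsToLock) {
                ReentrantLock lock = rowLocks.computeIfAbsent(row, r -> new ReentrantLock());
                lock.lock();
                held.add(lock);
            }
            store.putAll(updates);
        } finally {
            // Release in reverse order of acquisition.
            for (int i = held.size() - 1; i >= 0; i--) {
                held.get(i).unlock();
            }
        }
    }

    // Unlocked read; see the note above about atomicity.
    public String get(String row) {
        return store.get(row);
    }
}
```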


bq.  On 2012-02-05 07:26:08, Jesse Yates wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java,
line 4214
bq.  > <https://reviews.apache.org/r/3748/diff/1/?file=72044#file72044line4214>
bq.  >
bq.  >     nit: lockID rather than just lid would be slightly more descriptive.

getLock is pretty clear, so is rowsToLock.


bq.  On 2012-02-05 07:26:08, Jesse Yates wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java,
line 3346
bq.  > <https://reviews.apache.org/r/3748/diff/1/?file=72045#file72045line3346>
bq.  >
bq.  >     Wow, this is ugly. Maybe we should consider some refactoring of this later?

Not only ugly, but also wrong (see 2nd version of the patch). MultiRowMutation does not implement
Row so it cannot be part of a Multi action.


bq.  On 2012-02-05 07:26:08, Jesse Yates wrote:
bq.  > http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestAtomicOperation.java,
line 272
bq.  > <https://reviews.apache.org/r/3748/diff/1/?file=72048#file72048line272>
bq.  >
bq.  >     This class can get easily bloated as we add more types. Might be worth considering
refactoring this out into its own test.

See 2nd version of patch.


- Lars

                
      was (Author: jiraposter@reviews.apache.org):
    



-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3748/#review4822
-----------------------------------------------------------


On 2012-02-03 19:59:55, Lars Hofhansl wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3748/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-02-03 19:59:55)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  This builds on HBASE-3584, HBASE-5203, and HBASE-5304.
bq.  
bq.  Multiple Rows can be locked and applied atomically as long as the application ensures
that all rows reside in the same Region (by presplitting or a custom RegionSplitPolicy).
bq.  At SFDC we can use this to colocate subsets of a tenant's data and allow atomic operations
over these subsets.
bq.  
bq.  Obviously this is an advanced feature and this is prominently called out in the Javadoc.
bq.  
bq.  
bq.  This addresses bug HBASE-5229.
bq.      https://issues.apache.org/jira/browse/HBASE-5229
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTable.java
1239953 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java
1239953 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/MultiRowMutation.java
PRE-CREATION 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
1239953 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java
1239953 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java
1239953 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
1239953 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
1239953 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java
1239953 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
1239953 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestAtomicOperation.java
1239953 
bq.  
bq.  Diff: https://reviews.apache.org/r/3748/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Tests added to TestFromClientSide and TestAtomicOperation
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Lars
bq.  
bq.


                  
> Provide basic building blocks for "multi-row" local transactions.
> -----------------------------------------------------------------
>
>                 Key: HBASE-5229
>                 URL: https://issues.apache.org/jira/browse/HBASE-5229
>             Project: HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 5229-endpoint.txt, 5229-final.txt, 5229-multiRow-v2.txt, 5229-multiRow.txt,
5229-seekto-v2.txt, 5229-seekto.txt, 5229.txt
>
>
> In the final iteration, this issue provides a generalized, public mutateRowsWithLocks
method on HRegion, that can be used by coprocessors to implement atomic operations efficiently.
> Coprocessors are already region aware, which makes this a good pairing of APIs. This
feature is by design not available to the client via the HTable API.
> It took a long time to arrive at this and I apologize for the public exposure of my (erratic
in retrospect) thought processes.
> Was:
> HBase should provide basic building blocks for multi-row local transactions. Local means
that we do this by co-locating the data. Global (cross region) transactions are not discussed
here.
> After a bit of discussion two solutions have emerged:
> 1. Keep the row-key for determining grouping and location and allow efficient intra-row
scanning. A client application would then model tables as HBase-rows.
> 2. Define a prefix-length in HTableDescriptor that defines a grouping of rows. Regions
will then never be split inside a grouping prefix.
> #1 is true to the current storage paradigm of HBase.
> #2 is true to the current client side API.
> I will explore these two with sample patches here.
> --------------------
> Was:
> As discussed (at length) on the dev mailing list with the HBASE-3584 and HBASE-5203 committed,
supporting atomic cross row transactions within a region becomes simple.
> I am aware of the hesitation about the usefulness of this feature, but we have to start
somewhere.
> Let's use this jira for discussion, I'll attach a patch (with tests) momentarily to make
this concrete.
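
The prefix-length idea from option 2 in the description above can be illustrated with a small sketch. It assumes nothing about HTableDescriptor; PrefixGrouping, groupOf and sameGroup are hypothetical names, and the only idea shown is that rows sharing the first prefixLength bytes of their key belong to one group, which a split would have to keep together.

```java
import java.util.Arrays;
import java.util.List;

public class PrefixGrouping {
    // The group of a row is its first prefixLength bytes (or the whole key if shorter).
    static byte[] groupOf(byte[] row, int prefixLength) {
        return Arrays.copyOf(row, Math.min(prefixLength, row.length));
    }

    // True if every row falls into the same group; an empty list trivially does.
    static boolean sameGroup(List<byte[]> rows, int prefixLength) {
        if (rows.isEmpty()) {
            return true;
        }
        byte[] first = groupOf(rows.get(0), prefixLength);
        for (byte[] row : rows) {
            if (!Arrays.equals(first, groupOf(row, prefixLength))) {
                return false;
            }
        }
        return true;
    }
}
```

A client could run such a check before submitting a multi-row batch, on the assumption that rows in one group are never split across regions.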
