hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "HBase Review Board (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2353) HBASE-2283 removed bulk sync optimization for multi-row puts
Date Fri, 11 Jun 2010 17:26:17 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877866#action_12877866
] 

HBase Review Board commented on HBASE-2353:
-------------------------------------------

Message from: "Todd Lipcon" <todd@cloudera.com>


bq.  On 2010-06-11 09:59:29, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 1439
bq.  > <http://review.hbase.org/r/167/diff/1/?file=1237#file1237line1439>
bq.  >
bq.  >     Should this be public?  Isn't it just used internally?

good call, will make it package private


bq.  On 2010-06-11 09:59:29, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 1442
bq.  > <http://review.hbase.org/r/167/diff/1/?file=1237#file1237line1442>
bq.  >
bq.  >     Is this a copy?

Yep. I did it the simple way following the pseudocode in the JIRA, but we can definitely do
it more efficiently by indexing into a list or arraylist. My bet is that the efficiency gains
are marginal compared to the cost of actually writing to the WAL, etc, and this patch should
already be a big gain over what we used to have.

After this is committed let's file a followup JIRA to remove extra shallow copies. I'll add
a TODO in the code also.


bq.  On 2010-06-11 09:59:29, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 1467
bq.  > <http://review.hbase.org/r/167/diff/1/?file=1237#file1237line1467>
bq.  >
bq.  >     You were going to replace these w/ something from guava (or is this it?)

The guava Preconditions thing is supposed to be more for parameter checking, etc, than for
internal assertions. Currently these don't get run, but I think we should enable -ea at least
for unit tests.


bq.  On 2010-06-11 09:59:29, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 1482
bq.  > <http://review.hbase.org/r/167/diff/1/?file=1237#file1237line1482>
bq.  >
bq.  >     This creates new Map, pass in Map.Entry instead?

I thought both .keySet() and .entrySet() were just views into the existing map? (and p.getFamilyMap()
just returns a member)


bq.  On 2010-06-11 09:59:29, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 1777
bq.  > <http://review.hbase.org/r/167/diff/1/?file=1237#file1237line1777>
bq.  >
bq.  >     w can never be null here?  (There was null check previous)

yea, since it's assigned first in the try block, and that function doesn't throw exceptions.


- Todd


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/167/#review181
-----------------------------------------------------------





> HBASE-2283 removed bulk sync optimization for multi-row puts
> ------------------------------------------------------------
>
>                 Key: HBASE-2353
>                 URL: https://issues.apache.org/jira/browse/HBASE-2353
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ryan rawson
>            Assignee: Todd Lipcon
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: HBASE-2353_def_log_flush.patch
>
>
> previously to HBASE-2283 we used to call flush/sync once per put(Put[]) call (ie: batch
of commits).  Now we do for every row.  
> This makes bulk uploads slower if you are using WAL.  Is there an acceptable solution
to achieve both safety and performance by bulk-sync'ing puts?  Or would this not work in face
of atomic guarantees?
> discuss!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message