hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "HBase Review Board (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2353) HBASE-2283 removed bulk sync optimization for multi-row puts
Date Fri, 11 Jun 2010 18:02:15 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877891#action_12877891
] 

HBase Review Board commented on HBASE-2353:
-------------------------------------------

Message from: stack@duboce.net


bq.  On 2010-06-11 09:59:29, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 1442
bq.  > <http://review.hbase.org/r/167/diff/1/?file=1237#file1237line1442>
bq.  >
bq.  >     Is this a copy?
bq.  
bq.  Todd Lipcon wrote:
bq.      Yep. I did it the simple way following the pseudocode in the JIRA, but we can definitely
do it more efficiently by indexing into a list or arraylist. My bet is that the efficiency
gains are marginal compared to the cost of actually writing to the WAL, etc, and this patch
should already be a big gain over what we used to have.
bq.      
bq.      After this is committed let's file a followup JIRA to remove extra shallow copies.
I'll add a TODO in the code also.

Fine by me


bq.  On 2010-06-11 09:59:29, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 1467
bq.  > <http://review.hbase.org/r/167/diff/1/?file=1237#file1237line1467>
bq.  >
bq.  >     You were going to replace these w/ something from guava (or is this it?)
bq.  
bq.  Todd Lipcon wrote:
bq.      The guava Preconditions thing is supposed to be more for parameter checking, etc,
than for internal assertions. Currently these don't get run, but I think we should enable
-ea at least for unit tests.

NM.  Just saw this "By default, Surefire enables JVM assertions for the execution of your
test cases."  So, keep your asserts as is (we should all take them on)


bq.  On 2010-06-11 09:59:29, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java, line 1482
bq.  > <http://review.hbase.org/r/167/diff/1/?file=1237#file1237line1482>
bq.  >
bq.  >     This creates new Map, pass in Map.Entry instead?
bq.  
bq.  Todd Lipcon wrote:
bq.      I thought both .keySet() and .entrySet() were just views into the existing map? (and
p.getFamilyMap() just returns a member)

Looks like I'm wrong, at least regards JDK1.6.  Internally it seems to use entrySet.  Below
is from java.util.AbstractMap:


    public Set<K> keySet() {
    if (keySet == null) {
        keySet = new AbstractSet<K>() {
        public Iterator<K> iterator() {
            return new Iterator<K>() {
            private Iterator<Entry<K,V>> i = entrySet().iterator();
....


- stack


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/167/#review181
-----------------------------------------------------------





> HBASE-2283 removed bulk sync optimization for multi-row puts
> ------------------------------------------------------------
>
>                 Key: HBASE-2353
>                 URL: https://issues.apache.org/jira/browse/HBASE-2353
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ryan rawson
>            Assignee: Todd Lipcon
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: HBASE-2353_def_log_flush.patch
>
>
> previously to HBASE-2283 we used to call flush/sync once per put(Put[]) call (ie: batch
of commits).  Now we do for every row.  
> This makes bulk uploads slower if you are using WAL.  Is there an acceptable solution
to achieve both safety and performance by bulk-sync'ing puts?  Or would this not work in face
of atomic guarantees?
> discuss!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message