hbase-dev mailing list archives

From "Jean-Daniel Cryans (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-748) Add an efficient way to batch update many rows
Date Thu, 25 Sep 2008 14:49:46 GMT

    [ https://issues.apache.org/jira/browse/HBASE-748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634501#action_12634501
] 

Jean-Daniel Cryans commented on HBASE-748:
------------------------------------------

{quote}
I see you've made BatchUpdate comparable already. Scratch my suggestion above. I like the
fact that you can ask a BatchUpdate its size. Should size include row and column sizes too
to cover the case where they are really large?
{quote}

Ok!

{quote}
I'd been thinking that doing batch updates, we'd take out the splitsAndClosesLock so splits
wouldn't happen. Would this be easier if you moved the array of batch update handlings down
into HRegion rather than run it in HRS?
{quote}

I've been thinking about this too, glad to know you see it like that. Yeah, I should change
that.

Regarding the parens nitpick, sorry, it was a mindless copy/paste on my part (look at the trunk
version of commit with a single BU).

bq. I don't follow the above J-D? Are you saying that we can't do batching with 880?

Adding batching is not in the scope of HBASE-880; that issue only provides a more modifiable API
so that batching can be added later (this jira). The current patch needs refactoring to abstract
the code in flush so that we can give it a get/delete/commit/etc.

Your last two comments are related. Currently we do most of the validation server-side; moving
it to the client uncovers the problem that HTable doesn't know anything about the table
it handles. So this is a design issue: where should we do all validations? My proposal:
do it client-side, maybe in a helper class. Then the server-side code will be cleaner,
we won't do RPCs for nothing, and we won't repeat validations in the case of retries. This
should be in another jira.
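
To make the client-side proposal concrete, here is a rough sketch of what such a helper class might look like. Everything in it is an assumption for illustration: ClientSideValidator and its methods are hypothetical names, not existing HBase code, and it presumes the client has already fetched the table's column families once and that columns are written as "family:qualifier".

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of HBase: validates an update on the client
// so that an invalid update never costs an RPC (or a retry).
final class ClientSideValidator {
    // Column families of the table, assumed to have been fetched once up front.
    private final Set<String> families;

    ClientSideValidator(Set<String> families) {
        this.families = new HashSet<>(families);
    }

    /** Returns the problems found; an empty list means the update may be sent. */
    List<String> validate(String row, List<String> columns) {
        List<String> problems = new ArrayList<>();
        if (row == null || row.isEmpty()) {
            problems.add("empty row key");
        }
        for (String column : columns) {
            // Assumed column format: "family:qualifier".
            int sep = column.indexOf(':');
            if (sep <= 0) {
                problems.add("column lacks a family prefix: " + column);
                continue;
            }
            String family = column.substring(0, sep);
            if (!families.contains(family)) {
                problems.add("unknown column family: " + family);
            }
        }
        return problems;
    }
}
```

Since the result does not depend on the server, it could be computed once per update and reused across retries.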

> Add an efficient way to batch update many rows
> ----------------------------------------------
>
>                 Key: HBASE-748
>                 URL: https://issues.apache.org/jira/browse/HBASE-748
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 0.1.3, 0.2.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.19.0
>
>         Attachments: hbase-748-v1.patch
>
>
> HBASE-747 introduced a simple way to batch update many rows. The goal of this issue is
> to have an enhanced version that will send many rows in a single RPC to each region server.
> To do this, the client code will have to figure out which rows go to which server, group them
> accordingly and then send them.
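
The grouping step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: RowGrouper and groupByServer are hypothetical names, row keys are plain strings rather than byte arrays, and the region locations are assumed to be available to the client as a sorted map from each region's start key to its server name (with the first region starting at the empty key).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not HBase code: bucket rows by hosting server so that
// each server can later receive its rows in a single RPC.
final class RowGrouper {
    /**
     * @param regionStarts region start key -> server name (assumed already known)
     * @param rows         row keys to be updated
     * @return server name -> rows that live on that server, in input order
     */
    static Map<String, List<String>> groupByServer(
            TreeMap<String, String> regionStarts, List<String> rows) {
        Map<String, List<String>> batches = new TreeMap<>();
        for (String row : rows) {
            // The owning region is the one with the greatest start key <= row.
            Map.Entry<String, String> region = regionStarts.floorEntry(row);
            if (region == null) {
                throw new IllegalArgumentException("No region covers row: " + row);
            }
            batches.computeIfAbsent(region.getValue(), s -> new ArrayList<>()).add(row);
        }
        return batches;
    }
}
```

With the batches built, the client would issue one call per map entry; how that call looks and how it copes with regions moving between lookup and send is not covered by this sketch.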

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

