hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchDelete()
Date Tue, 03 Jul 2012 01:33:06 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405487#comment-13405487 ]

Anoop Sam John commented on HBASE-6284:
---------------------------------------

@Ted
bq.Can we buffer Appends and Increments so that mutates List contains certain amount of Mutate's?

Wouldn't this delay the Append and Increment calls?  Suppose the List arrives as
Put(r1), Put(r2), Increment(r1), Put(r1)  [I don't know whether anyone uses it this way].
With the current trunk code, because we try to maintain the sequence, the value for r1
(suppose only one CF and qualifier) will be the one from the last Put. But if we buffer the
Increment and let the Puts happen first, the final value may come out different!
Basically we might lose the sequence. I am not sure whether this is the reason the code in
trunk was changed, or which issue changed it; I am just thinking this may be the reason.
What do you say?  Please correct me if my understanding is wrong.
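
To make the ordering concern concrete, here is a minimal sketch in plain Java. It does not use
the HBase API: the Mutation class, the in-memory map standing in for a region, and the two
apply methods are all hypothetical names invented for illustration, and the starting values
(10, 20, 5, 100) are made up. It only shows that deferring the Increment past the later Put
changes the final value of r1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MutationOrderSketch {
    enum Kind { PUT, INCREMENT }

    // Hypothetical stand-in for a mutation on one column of one row.
    static final class Mutation {
        final Kind kind;
        final String row;
        final long value;

        Mutation(Kind kind, String row, long value) {
            this.kind = kind;
            this.row = row;
            this.value = value;
        }
    }

    // A Put overwrites the stored value; an Increment adds to it.
    static void apply(Map<String, Long> table, Mutation m) {
        if (m.kind == Kind.PUT) {
            table.put(m.row, m.value);
        } else {
            table.merge(m.row, m.value, Long::sum);
        }
    }

    // Applies every mutation in the order the client supplied it.
    static Map<String, Long> applyInOrder(List<Mutation> batch) {
        Map<String, Long> table = new HashMap<>();
        for (Mutation m : batch) {
            apply(table, m);
        }
        return table;
    }

    // Applies all Puts first and the buffered Increments afterwards.
    static Map<String, Long> applyWithBufferedIncrements(List<Mutation> batch) {
        Map<String, Long> table = new HashMap<>();
        List<Mutation> buffered = new ArrayList<>();
        for (Mutation m : batch) {
            if (m.kind == Kind.INCREMENT) {
                buffered.add(m);
            } else {
                apply(table, m);
            }
        }
        for (Mutation m : buffered) {
            apply(table, m);
        }
        return table;
    }

    public static void main(String[] args) {
        List<Mutation> batch = Arrays.asList(
            new Mutation(Kind.PUT, "r1", 10),
            new Mutation(Kind.PUT, "r2", 20),
            new Mutation(Kind.INCREMENT, "r1", 5),
            new Mutation(Kind.PUT, "r1", 100));
        System.out.println(applyInOrder(batch).get("r1"));                // 100
        System.out.println(applyWithBufferedIncrements(batch).get("r1")); // 105
    }
}
```

With the sequence kept, the last Put wins and r1 ends at 100; with the Increment buffered, it
is applied on top of the last Put and r1 ends at 105.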
                
> Introduce HRegion#doMiniBatchDelete()
> -------------------------------------
>
>                 Key: HBASE-6284
>                 URL: https://issues.apache.org/jira/browse/HBASE-6284
>             Project: HBase
>          Issue Type: Bug
>          Components: performance, regionserver
>            Reporter: Zhihong Ted Yu
>            Assignee: Anoop Sam John
>             Fix For: 0.96.0, 0.94.2
>
>         Attachments: 6284-trunk-suggest.txt, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, HBASE-6284_Trunk.patch
>
>
> From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion':
> The HTable#delete(List<Delete>) groups the Deletes for the same RS and makes only one
> n/w call. But within the RS, there will be N delete calls on the region, one by one.
> This will include N HLog writes and syncs. If these can also be grouped, can we get
> better performance for multi-row deletes?
> I have made the new miniBatchDelete() and made HTable#delete(List<Delete>) call this
> new batch delete.
> Just tested initially with a one-node cluster.  Even there I am getting a performance
> boost which is very promising.
> Only one CF and qualifier.
> 10K total rows deleted with a batch of 100 deletes. Only deletes happening on the table,
> from one thread.
> With the new way the net time taken is reduced by more than 1/10
> Will test in a 4-node cluster also. I think it will be worth doing this change.
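
The saving described above can be sketched by counting log syncs. This is only an
illustration under stated assumptions: CountingLog and both delete methods are hypothetical
names, not the HRegion or HLog code, and the sketch assumes exactly one sync per
single-row delete versus one sync per mini-batch. Real sync behaviour on a region server
may differ.

```java
import java.util.ArrayList;
import java.util.List;

class MiniBatchDeleteSketch {
    // Hypothetical write-ahead log that only counts appends and syncs.
    static final class CountingLog {
        int appends;
        int syncs;

        void append(String row) { appends++; }
        void sync() { syncs++; }
    }

    // Helper that builds n distinct row keys.
    static List<String> makeRows(int n) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            rows.add("row-" + i);
        }
        return rows;
    }

    // One log append and one sync per deleted row.
    static void deleteOneByOne(CountingLog log, List<String> rows) {
        for (String row : rows) {
            log.append(row);
            log.sync();
        }
    }

    // One log append per row, but only one sync per mini-batch.
    static void deleteInMiniBatches(CountingLog log, List<String> rows, int batchSize) {
        for (int start = 0; start < rows.size(); start += batchSize) {
            int end = Math.min(start + batchSize, rows.size());
            for (String row : rows.subList(start, end)) {
                log.append(row);
            }
            log.sync();
        }
    }

    public static void main(String[] args) {
        List<String> rows = makeRows(10_000);

        CountingLog single = new CountingLog();
        deleteOneByOne(single, rows);
        System.out.println(single.syncs);  // 10000

        CountingLog batched = new CountingLog();
        deleteInMiniBatches(batched, rows, 100);
        System.out.println(batched.syncs); // 100
    }
}
```

For the 10K rows and batch size of 100 mentioned in the test, the sketch performs the same
number of appends either way but 100 syncs instead of 10,000.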


        
