hbase-dev mailing list archives

From "ryan rawson (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2066) Perf: parallelize puts
Date Fri, 12 Feb 2010 03:05:28 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832821#action_12832821 ]

ryan rawson commented on HBASE-2066:
------------------------------------

I ran TestBatchPut for a while and inserted 3.3GB of data without problems. It ended up with
about 4 table splits. No more concurrent exceptions and no major slowdown. The threads got
slower as my machine bogged down, but it wasn't the crazy kind of exponential slowdown
originally reported.

If there are no complaints, I'm going to commit this as-is.

> Perf: parallelize puts
> ----------------------
>
>                 Key: HBASE-2066
>                 URL: https://issues.apache.org/jira/browse/HBASE-2066
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.2
>            Reporter: ryan rawson
>            Assignee: ryan rawson
>             Fix For: 0.21.0
>
>         Attachments: HBASE-2066-3.patch, HBASE-2066-branch.patch, HBASE-2066-v2.patch, TestBatchPut.java
>
>
> Right now, with tables that have a large region count, the write buffer is not efficient.
> This is because we issue potentially N RPCs, where N is the number of regions in the table.
> When N gets large (let's say 1200+), things become very slow.
> Instead, if we batch things up using a different RPC and use thread pools, we could see
> higher performance.
> This requires an RPC change...
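
The batching idea in the description above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the attached patches: the class `BatchedPutSketch`, the `BatchSender` interface, the `locator` function, and the choice to group by hosting server are all hypothetical stand-ins, and rows are plain strings rather than real HBase `Put` objects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch: none of these names come from HBase or the patch.
class BatchedPutSketch {

    /** Stand-in for the batched RPC: sends one batch to one server, returns rows acknowledged. */
    interface BatchSender {
        int send(String server, List<String> rows) throws Exception;
    }

    /** Groups buffered rows by destination; the locator (row -> server) is an assumption. */
    static Map<String, List<String>> groupByServer(List<String> rows,
                                                   Function<String, String> locator) {
        Map<String, List<String>> batches = new HashMap<>();
        for (String row : rows) {
            batches.computeIfAbsent(locator.apply(row), s -> new ArrayList<>()).add(row);
        }
        return batches;
    }

    /** Sends one batch per destination on a thread pool and sums the acknowledgements. */
    static int flush(List<String> rows, Function<String, String> locator,
                     BatchSender sender, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (Map.Entry<String, List<String>> e : groupByServer(rows, locator).entrySet()) {
                Callable<Integer> task = () -> sender.send(e.getKey(), e.getValue());
                pending.add(pool.submit(task));
            }
            int acked = 0;
            for (Future<Integer> f : pending) {
                acked += f.get(); // rethrows a per-batch failure as ExecutionException
            }
            return acked;
        } finally {
            pool.shutdown();
        }
    }
}
```

With this shape, the number of calls scales with the number of distinct destinations returned by the locator rather than with the number of buffered rows, and the calls overlap instead of running one after another. Whether that matches the RPC change in the patch is not something this sketch claims.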

