hbase-issues mailing list archives

From "nkeywal (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-6295) Possible performance improvement in client batch operations: presplit and send in background
Date Sat, 30 Jun 2012 11:46:42 GMT
nkeywal created HBASE-6295:
------------------------------

             Summary: Possible performance improvement in client batch operations: presplit and send in background
                 Key: HBASE-6295
                 URL: https://issues.apache.org/jira/browse/HBASE-6295
             Project: HBase
          Issue Type: Improvement
          Components: client
    Affects Versions: 0.96.0
            Reporter: nkeywal


Today the batch algorithm is:
{noformat}
for Operation o: List<Op>{
  add o to todolist
  if todolist > maxsize or o last in list
    split todolist per location
    send split lists to region servers
    clear todolist
    wait
}
{noformat}
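
For illustration, the current loop could be sketched in Java roughly as follows. This is not the actual HBase client code: GlobalBufferBatcher, locator and blockingSender are hypothetical names, and the send is assumed to block until the region server has answered.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Hypothetical sketch of the current behaviour: one shared buffer, split only when it is full.
class GlobalBufferBatcher {
    /** Returns the number of "split, send, wait" rounds that were needed. */
    static <Op> int submitAll(List<Op> ops, int maxSize,
                              Function<Op, String> locator,               // assumed: op -> region server
                              BiConsumer<String, List<Op>> blockingSender) { // assumed: blocking RPC
        List<Op> todo = new ArrayList<>();
        int rounds = 0;
        for (int i = 0; i < ops.size(); i++) {
            todo.add(ops.get(i));
            boolean last = (i == ops.size() - 1);
            if (todo.size() >= maxSize || last) {
                // Split the whole buffer per location
                Map<String, List<Op>> perLocation = new HashMap<>();
                for (Op o : todo) {
                    perLocation.computeIfAbsent(locator.apply(o), k -> new ArrayList<>()).add(o);
                }
                // Send every sub-list and wait; nothing is buffered in the meantime
                perLocation.forEach(blockingSender);
                todo.clear();
                rounds++;
            }
        }
        return rounds;
    }
}
```

With five operations alternating between two locations and maxSize = 2, this sketch performs three rounds and five small sends, because each round splits a two-element buffer into one-element lists.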

We could:
- create the final object immediately instead of an intermediate array
- split per location immediately
- instead of sending when the list as a whole is full, send as soon as there is enough data for a single location

The loop would become:
{noformat}
for Operation o: List<Op>{
  get location
  add o to location.todolist
  if (location.todolist > maxLocationSize)
    send location.todolist to region server
    clear location.todolist
    // don't wait, continue the loop
}
send remaining
wait
{noformat}
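
A minimal Java sketch of the proposed loop, without error handling, could look like this. All names here (PerLocationBatcher, locator, sender) are hypothetical and not part of the HBase API; the sketch also assumes a buffer is sent as soon as it reaches maxLocationSize elements.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Hypothetical sketch: one buffer per location, full buffers are sent in the background.
class PerLocationBatcher<Op> {
    private final Function<Op, String> locator;         // assumed: op -> region server
    private final BiConsumer<String, List<Op>> sender;  // assumed: performs the RPC
    private final int maxLocationSize;
    private final Executor pool;
    private final Map<String, List<Op>> todoByLocation = new HashMap<>();
    private final List<CompletableFuture<Void>> inFlight = new ArrayList<>();

    PerLocationBatcher(Function<Op, String> locator, BiConsumer<String, List<Op>> sender,
                       int maxLocationSize, Executor pool) {
        this.locator = locator;
        this.sender = sender;
        this.maxLocationSize = maxLocationSize;
        this.pool = pool;
    }

    void submitAll(List<Op> ops) {
        for (Op o : ops) {
            String location = locator.apply(o);
            List<Op> todo = todoByLocation.computeIfAbsent(location, k -> new ArrayList<>());
            todo.add(o);
            if (todo.size() >= maxLocationSize) {
                flush(location); // don't wait, continue the loop
            }
        }
        // Send remaining (copy the keys because flush() removes entries)
        for (String location : new ArrayList<>(todoByLocation.keySet())) {
            flush(location);
        }
        // Wait for every background send to finish
        CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();
        inFlight.clear();
    }

    private void flush(String location) {
        List<Op> batch = todoByLocation.remove(location); // the next op gets a fresh list
        if (batch != null && !batch.isEmpty()) {
            inFlight.add(CompletableFuture.runAsync(() -> sender.accept(location, batch), pool));
        }
    }
}
```

The per-location map is only touched by the calling thread, so the sketch needs no locking; only the sender runs on the pool.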

It is not trivial to write once you add error management: the list of retried operations must be shared with the operations being added to the todolist. But it is doable.
It is mainly interesting for 'big' writes.
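
One possible way to share the retried operations with the todolist is sketched below. It is only an assumption about how this could be done, not a design from the issue: the hypothetical sender returns the operations that failed, background sends push them onto a concurrent queue, and the calling thread merges that queue back into the same per-location buffers. RetryingBatcher and maxFinalRounds are made-up names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical sketch: failed ops re-enter the same per-location buffers as new ops.
class RetryingBatcher<Op> {
    private final Function<Op, String> locator;                  // assumed: op -> region server
    private final BiFunction<String, List<Op>, List<Op>> sender; // assumed: returns the failed ops
    private final int maxLocationSize;
    private final int maxFinalRounds;                            // retry rounds after the main loop
    private final Executor pool;
    private final Map<String, List<Op>> todoByLocation = new HashMap<>();
    private final Queue<Op> failed = new ConcurrentLinkedQueue<>();
    private final List<CompletableFuture<Void>> inFlight = new ArrayList<>();

    RetryingBatcher(Function<Op, String> locator, BiFunction<String, List<Op>, List<Op>> sender,
                    int maxLocationSize, int maxFinalRounds, Executor pool) {
        this.locator = locator;
        this.sender = sender;
        this.maxLocationSize = maxLocationSize;
        this.maxFinalRounds = maxFinalRounds;
        this.pool = pool;
    }

    /** Returns the operations that still failed after the last retry round. */
    List<Op> submitAll(List<Op> ops) {
        for (Op o : ops) {
            requeueFailed(); // retried ops are merged with the ops being added
            buffer(o);
        }
        for (int round = 0; ; round++) {
            for (String location : new ArrayList<>(todoByLocation.keySet())) {
                flush(location);
            }
            CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();
            inFlight.clear();
            if (failed.isEmpty() || round == maxFinalRounds) {
                break;
            }
            requeueFailed();
        }
        List<Op> givenUp = new ArrayList<>(failed);
        failed.clear();
        return givenUp;
    }

    private void requeueFailed() {
        // Drain only what is present now, so an op that keeps failing cannot spin this loop
        for (int n = failed.size(); n > 0; n--) {
            Op o = failed.poll();
            if (o == null) break;
            buffer(o);
        }
    }

    private void buffer(Op o) {
        String location = locator.apply(o);
        List<Op> todo = todoByLocation.computeIfAbsent(location, k -> new ArrayList<>());
        todo.add(o);
        if (todo.size() >= maxLocationSize) {
            flush(location);
        }
    }

    private void flush(String location) {
        List<Op> batch = todoByLocation.remove(location);
        if (batch != null && !batch.isEmpty()) {
            inFlight.add(CompletableFuture.runAsync(
                    () -> failed.addAll(sender.apply(location, batch)), pool));
        }
    }
}
```

The sketch re-resolves the location of a retried operation through locator, and it bounds retries by rounds rather than per operation; a real client would probably want a per-operation retry count and backoff.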

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
