hbase-issues mailing list archives

From "Gary Helmling (JIRA)" <j...@apache.org>
Subject [jira] [Reopened] (HBASE-4143) HTable.doPut(List) should check the writebuffer length every so often
Date Thu, 28 Jul 2011 23:50:09 GMT

     [ https://issues.apache.org/jira/browse/HBASE-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Helmling reopened HBASE-4143:

The fix committed for this issue unfortunately causes a new performance problem when autoflush==true.

In this case, flushCommits() is called every 10 items in the List<Put>, which effectively
disables batching.

For the check internal to the for loop over List<Put>, we should only be checking if
currentWriteBufferSize > writeBufferSize.  The check on autoflush should only come after
the for loop.
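
To illustrate the proposed ordering of the checks, here is a minimal sketch. It is not the real HTable code: WriteBufferSketch, CHECK_INTERVAL, flushCount, and buffered() are hypothetical names, byte[] stands in for Put, and the array length stands in for put.heapSize(). The interval of 10 is assumed from the behavior described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the client write buffer; NOT the real HTable.
class WriteBufferSketch {
    // Assumed check interval, taken from the "every 10 items" behavior described above.
    private static final int CHECK_INTERVAL = 10;

    private final List<byte[]> writeBuffer = new ArrayList<>();
    private final long writeBufferSize;
    private final boolean autoFlush;
    private long currentWriteBufferSize = 0;
    int flushCount = 0; // counts non-empty flushes, for demonstration only

    WriteBufferSketch(long writeBufferSize, boolean autoFlush) {
        this.writeBufferSize = writeBufferSize;
        this.autoFlush = autoFlush;
    }

    void doPut(List<byte[]> puts) {
        int n = 0;
        for (byte[] put : puts) {
            writeBuffer.add(put);
            currentWriteBufferSize += put.length; // stands in for put.heapSize()
            n++;
            // Inside the loop: check the buffer size only, never autoFlush.
            if (n % CHECK_INTERVAL == 0 && currentWriteBufferSize > writeBufferSize) {
                flushCommits();
            }
        }
        // After the loop: autoFlush is honored once for the whole list.
        if (autoFlush || currentWriteBufferSize > writeBufferSize) {
            flushCommits();
        }
    }

    void flushCommits() {
        if (writeBuffer.isEmpty()) {
            return; // nothing to send
        }
        writeBuffer.clear(); // a real client would send the buffered puts here
        currentWriteBufferSize = 0;
        flushCount++;
    }

    int buffered() {
        return writeBuffer.size();
    }

    public static void main(String[] args) {
        List<byte[]> puts = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            puts.add(new byte[10]);
        }
        WriteBufferSketch table = new WriteBufferSketch(1_000_000L, true);
        table.doPut(puts);
        // One flush for the whole list, rather than one per 10 items.
        System.out.println("flushes with autoFlush=true: " + table.flushCount); // prints 1
    }
}
```

With this ordering, a list of 100 small puts and autoflush enabled results in a single flush, while a small write buffer still triggers periodic flushes inside the loop.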

> HTable.doPut(List) should check the writebuffer length every so often
> ---------------------------------------------------------------------
>                 Key: HBASE-4143
>                 URL: https://issues.apache.org/jira/browse/HBASE-4143
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Doug Meil
>            Assignee: Doug Meil
>            Priority: Minor
>         Attachments: client_HBASE_4143.patch
> This came up in a dist-list conversation between Andy P., Ted Yu, and myself.  Andy noted
> that extremely large lists passed into put(List) can cause issues.  Ted suggested having
> doPut check the write-buffer length every so often (5-10 records?) so the flush doesn't happen
> only at the end, and I think that's a good idea.
>   public void put(final List<Put> puts) throws IOException {
>     doPut(puts);
>   }
>   private void doPut(final List<Put> puts) throws IOException {
>     for (Put put : puts) {
>       validatePut(put);
>       writeBuffer.add(put);
>       currentWriteBufferSize += put.heapSize();
>     }
>     if (autoFlush || currentWriteBufferSize > writeBufferSize) {
>       flushCommits();
>     }
>   }
> Once this change is made, remove the comment in HBASE-4142 about large lists being a
> performance problem.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

