hbase-user mailing list archives

From Schubert Zhang <zson...@gmail.com>
Subject Re: HBase commit autoflush
Date Thu, 13 Aug 2009 02:15:04 GMT
The current writeBuffer is an ArrayList; could using a sorted list give better
performance?
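The sorted-buffer idea can be sketched with plain Java collections. This is a hypothetical toy model, not the real HBase client: it replaces the ArrayList with a TreeMap keyed by row, so buffered edits come back in row order and consecutive edits for the same region could be grouped into one batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy illustration (not the actual HBase writeBuffer): keeping buffered
// edits in a TreeMap keyed by row means they are always sorted, so edits
// destined for the same region end up contiguous when the buffer flushes.
public class SortedWriteBuffer {
    private final TreeMap<String, String> buffer = new TreeMap<>();

    public void put(String row, String value) {
        buffer.put(row, value);
    }

    // Rows come back in lexicographic order, ready to be grouped by region.
    public List<String> sortedRows() {
        return new ArrayList<>(buffer.keySet());
    }

    public static void main(String[] args) {
        SortedWriteBuffer b = new SortedWriteBuffer();
        b.put("row-9", "v");
        b.put("row-1", "v");
        b.put("row-5", "v");
        System.out.println(b.sortedRows()); // [row-1, row-5, row-9]
    }
}
```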

On Thu, Aug 13, 2009 at 5:06 AM, Jonathan Gray <jlist@streamy.com> wrote:

> That is right.
>
> If this is the case, and you are not under tight memory constraints
> client-side, you can significantly increase the size of your write buffer.
>
> JG
>
>
> Schubert Zhang wrote:
>
>> When autoflush is set to false, the client-side write buffer works as
>> expected, but we may not get better performance for random inserts, since
>>
>> org.apache.hadoop.hbase.client.HConnectionManager.TableServers.processBatchOfRows()
>> will still send the random rows to their respective servers one by one.
>>
>>
>> In the book:OReilly.Hadoop.The.Definitive.Guide.June.2009
>> <--
>> By default, each HTable.commit(BatchUpdate) actually performs the insert
>> without
>> any buffering. You can disable HTable auto-flush feature using
>> HTable.setAutoFlush(false) and then set the size of configurable write
>> buffer. When the inserts committed fill the write buffer, it is then
>> flushed. Remember though, you
>> must call
>> a manual HTable.flushCommits() at the end of each task to ensure that
>> nothing is
>> left unflushed in the buffer. You could do this in an override of the
>> mapper’s
>> close() method.
>> -->
>>
>>
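The buffering behavior the book describes can be modeled with a small self-contained sketch. This is a toy stand-in, not the real HTable API: with auto-flush off, puts accumulate until the buffer reaches its configured size, and a final explicit flush drains whatever is left, mirroring the required manual flushCommits() at the end of a task.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the client-side write buffer described above (hypothetical,
// not the real HBase HTable): puts accumulate until the buffer is full,
// and a final explicit flush drains any remaining edits.
public class BufferedClient {
    private final int bufferSize;
    private final List<String> buffer = new ArrayList<>();
    private int flushes = 0;

    public BufferedClient(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    public void put(String row) {
        buffer.add(row);
        if (buffer.size() >= bufferSize) {
            flushCommits();          // automatic flush once the buffer fills
        }
    }

    public void flushCommits() {     // must also be called once at the end
        if (!buffer.isEmpty()) {
            flushes++;               // stands in for one batched send
            buffer.clear();
        }
    }

    public int getFlushes() {
        return flushes;
    }

    public static void main(String[] args) {
        BufferedClient c = new BufferedClient(3);
        for (int i = 0; i < 7; i++) {
            c.put("row-" + i);
        }
        c.flushCommits();            // drain the last partial buffer
        System.out.println(c.getFlushes()); // 3 flushes for 7 puts
    }
}
```

Forgetting the final flushCommits() in this model would silently drop the last partial buffer, which is exactly the failure mode the book warns about.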
