hbase-user mailing list archives

From Doug Meil <doug.m...@explorysmedical.com>
Subject Re: batch update question
Date Wed, 05 Sep 2012 17:01:23 GMT

Hi there, for more information about the hbase client, see


On 9/5/12 12:59 PM, "Doug Meil" <doug.meil@explorysmedical.com> wrote:

>Hi there, if you look in the source code for HTable there is a list of Put
>objects.  That's the buffer, and it's a client-side buffer.
>On 9/5/12 12:04 PM, "Lin Ma" <linlma@gmail.com> wrote:
>>Thank you Stack for the detailed directions!
>>1. You are right, I have not run into any real row contention issues. My
>>purpose is to understand the issue in advance, and through it to
>>understand HBase in general better.
>>2. Regarding the comment on the API page you referred to -- "If
>>false, the update is buffered until the internal buffer is full." -- I am
>>confused about what the buffer is. Is it a buffer on the client side or in
>>the region server? Is there a way to configure how much it holds before
>>flushing?
>>3. Why could batching resolve contention on the same row in theory,
>>compared to non-batch operations? Besides preparing a solution in
>>advance, I want to learn a bit about why. :-)
>>On Wed, Sep 5, 2012 at 4:00 AM, Stack <stack@duboce.net> wrote:
>>> On Sun, Sep 2, 2012 at 2:13 AM, Lin Ma <linlma@gmail.com> wrote:
>>> > Hello guys,
>>> >
>>> > I am reading the book "HBase: The Definitive Guide". At the beginning
>>> > of chapter 3, it is mentioned that, in order to reduce the performance
>>> > impact of clients updating the same row (lock contention issues for
>>> > automatic write), batch update is preferred. My question is, for an MR
>>> > job, what are the batch update methods we could leverage to resolve
>>> > the issue? And for the API client, what are the batch update methods
>>> > we could leverage to resolve the issue?
>>> >
>>> Do you actually have a problem where there is contention on a single
>>> row?
>>> Use methods like
>>> or the batch methods listed earlier in the API.  You should set
>>> autoflush to false too:
>>> Even with batching, a highly contended row might hold up inserts... but
>>> are you sure you actually have this problem in the first place?
>>> St.Ack
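
To make the client-side buffer discussed above concrete, here is a minimal sketch in plain Java. It is a toy model, not the HTable source: the class BufferingClient, its byte accounting, and the 40-byte limit are all made up for illustration. It only shows the idea from the thread: with autoflush off, puts pile up in a list on the client and go out in one round trip once the buffer is full or is flushed explicitly. In the HBase client API of this era the matching calls are, as far as I know, HTable.setAutoFlush(false), HTable.setWriteBufferSize(long) and HTable.flushCommits().

```java
import java.util.ArrayList;
import java.util.List;

// Entry point is declared first so `java WriteBufferSketch.java` runs it.
final class WriteBufferSketch {
    public static void main(String[] args) {
        BufferingClient buffered = new BufferingClient(false, 40);
        BufferingClient unbuffered = new BufferingClient(true, 40);

        for (int i = 0; i < 10; i++) {
            buffered.put("row" + i, "value!");
            unbuffered.put("row" + i, "value!");
        }
        // Push out whatever is still sitting in the client-side buffer.
        buffered.flush();
        unbuffered.flush();

        System.out.println("autoFlush=false round trips: " + buffered.rpcCount());
        System.out.println("autoFlush=true  round trips: " + unbuffered.rpcCount());
    }
}

// Hypothetical stand-in for a buffering client; NOT HBase's HTable.
final class BufferingClient {
    private final boolean autoFlush;
    private final long maxBufferedBytes;             // assumed size limit
    private final List<String> buffer = new ArrayList<>();
    private long bufferedBytes = 0;
    private int rpcCount = 0;                        // simulated round trips
    private int rowsSent = 0;

    BufferingClient(boolean autoFlush, long maxBufferedBytes) {
        this.autoFlush = autoFlush;
        this.maxBufferedBytes = maxBufferedBytes;
    }

    // Queue one update; send right away only if autoflush is on
    // or the buffer has reached its size limit.
    void put(String row, String value) {
        buffer.add(row + "=" + value);
        bufferedBytes += row.length() + value.length();
        if (autoFlush || bufferedBytes >= maxBufferedBytes) {
            flush();
        }
    }

    // Send everything buffered so far as a single simulated round trip.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        rpcCount++;
        rowsSent += buffer.size();
        buffer.clear();
        bufferedBytes = 0;
    }

    int pendingCount() { return buffer.size(); }
    int rpcCount() { return rpcCount; }
    int rowsSent() { return rowsSent; }
}
```

In this toy each put counts as 10 bytes, so the buffered client sends after every fourth put and once more on the final flush, while the autoflush client sends on every put. The point of the comparison is only the number of round trips; it says nothing about how a region server handles locking on a contended row.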
