hbase-user mailing list archives

From Lin Ma <lin...@gmail.com>
Subject Re: batch update question
Date Thu, 06 Sep 2012 15:54:11 GMT
Thank you Doug,

Very effective reply. :-)

- Why could batch update resolve the contention issue on the same row? Could
you elaborate a bit more or show me an example?
- Does batch update always perform better than single updates (when we
measure total throughput)?

regards,
Lin

On Thu, Sep 6, 2012 at 12:59 AM, Doug Meil <doug.meil@explorysmedical.com> wrote:

>
> Hi there, if you look in the source code for HTable there is a list of Put
> objects.  That's the buffer, and it's a client-side buffer.
>
>
>
>
>
> On 9/5/12 12:04 PM, "Lin Ma" <linlma@gmail.com> wrote:
>
> >Thank you Stack for the detailed directions!
> >
> >1. You are right, I have not met any real row contention issues. My
> >purpose is to understand the issue in advance, and through it to
> >understand HBase in general better;
> >2. For the comment on the API page you referred to -- "If isAutoFlush
> ><http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTableInterface.html#isAutoFlush%28%29>
> >is false, the update is buffered until the internal buffer is full." -- I
> >am confused about what the buffer is. Is it a buffer on the client side or
> >in the region server? Is there a way to configure how much it holds before
> >flushing?
> >3. Why could batching resolve contention on the same row in theory,
> >compared to non-batch operations? Besides preparing a solution in
> >advance, I want to learn a bit about why. :-)
> >
> >regards,
> >Lin
> >
> >On Wed, Sep 5, 2012 at 4:00 AM, Stack <stack@duboce.net> wrote:
> >
> >> On Sun, Sep 2, 2012 at 2:13 AM, Lin Ma <linlma@gmail.com> wrote:
> >> > Hello guys,
> >> >
> >> > I am reading the book "HBase, the definitive guide". At the beginning of
> >> > chapter 3, it is mentioned that in order to reduce the performance impact
> >> > of clients updating the same row (lock contention issues for atomic
> >> > writes), batch update is preferred. My question is, for an MR job, what
> >> > are the batch update methods we could leverage to resolve the issue? And
> >> > for an API client, what are the batch update methods we could leverage to
> >> > resolve the issue?
> >> >
> >>
> >> Do you actually have a problem where there is contention on a single row?
> >>
> >> Use methods like
> >>
> >> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#put(java.util.List)
> >> or the batch methods listed earlier in the API.  You should set
> >> autoflush to false too:
> >>
> >> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTableInterface.html#isAutoFlush()
> >>
> >> Even with batching, a highly contended row might hold up inserts... but
> >> are you sure you actually have this problem in the first place?
> >>
> >> St.Ack
> >>
>
>
>
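
To make the client-side buffer discussed in this thread concrete, here is a minimal sketch in plain Java. It is a toy model, not HBase code: the class WriteBufferModel and all of its members are hypothetical names made up for illustration, and the flush threshold is counted in puts, whereas the real HTable write buffer is sized in bytes. The method names put and flushCommits are only borrowed from the HTable API to show which role each one plays. The sketch counts how many round trips to the server are needed with autoflush on versus off.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a client-side write buffer. NOT the HBase implementation. */
class WriteBufferModel {
    private final boolean autoFlush;   // true: every put is sent immediately
    private final int maxBuffered;     // hypothetical threshold, counted in puts
    private final List<String> buffer = new ArrayList<>();
    private int rpcCount = 0;          // simulated round trips to the server
    private int rowsSent = 0;          // total puts delivered so far

    WriteBufferModel(boolean autoFlush, int maxBuffered) {
        this.autoFlush = autoFlush;
        this.maxBuffered = maxBuffered;
    }

    /** Queue one update; send it right away only if autoflush is on or the buffer is full. */
    void put(String row) {
        buffer.add(row);
        if (autoFlush || buffer.size() >= maxBuffered) {
            flushCommits();
        }
    }

    /** Send everything that is buffered in one simulated round trip. */
    void flushCommits() {
        if (buffer.isEmpty()) {
            return;
        }
        rpcCount++;
        rowsSent += buffer.size();
        buffer.clear();
    }

    int rpcCount() { return rpcCount; }
    int rowsSent() { return rowsSent; }
    int pending()  { return buffer.size(); }

    public static void main(String[] args) {
        WriteBufferModel single = new WriteBufferModel(true, 100);
        WriteBufferModel batched = new WriteBufferModel(false, 100);
        for (int i = 0; i < 250; i++) {
            single.put("row-" + i);
            batched.put("row-" + i);
        }
        batched.flushCommits(); // push out the remainder still held on the client
        System.out.println("autoflush on:  " + single.rpcCount() + " round trips");
        System.out.println("autoflush off: " + batched.rpcCount() + " round trips");
    }
}
```

In this model, 250 puts cost 250 round trips with autoflush on and 3 with autoflush off. That only illustrates the throughput side of the question. It does not model server-side row locking, so it is consistent with Stack's caveat that a highly contended row might still hold up inserts even when batching. With the real HTable API of this era, the corresponding calls are setAutoFlush(false), put(List<Put>), flushCommits(), and setWriteBufferSize(long).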
