hbase-user mailing list archives

From Imran M Yousuf <imyou...@gmail.com>
Subject Re: Best way to get multiple non-sequential rows
Date Wed, 25 Aug 2010 04:00:40 GMT
Hi Jonathan,

On Wed, Aug 25, 2010 at 9:52 AM, Jonathan Gray <jgray@facebook.com> wrote:
> Michael,
>
> MultiGet is about performing a set of Get operations in parallel from the
> client.  So it buys you potential performance benefits from the
> concurrency/distribution of your operations.
>
> Roughly, you would bucket the gets according to their region and
> regionserver.  Then spawn a thread for each RS and fire off the Gets
> concurrently.
>
> If I have 100 Gets to perform on a random set of keys, assuming each get
> takes 10ms, doing them sequentially will take 1 second.  Other factors and
> RS concurrency aside, with MultiGet on a 10 node cluster, the total time
> would be reduced to 100ms. With 50 nodes, 20ms.

The MultiGet functionality does seem great, and it is exactly what I
am looking for! I am eagerly waiting to see it make its way into the
client API.
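
For what it is worth, here is a rough sketch of the bucketing idea quoted
above, in plain Java with no HBase dependency. The names serverFor and
fetchRow are hypothetical stand-ins I made up for the region lookup and the
actual Get; they are not HBase API and are not taken from HBASE-1845.
estimateMillis only reproduces the back-of-the-envelope numbers quoted above,
under the same idealised assumptions (keys spread evenly, fixed latency per
Get, no other overhead).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MultiGetSketch {

    // Hypothetical stand-in for a region lookup: maps a row key to a server index.
    public static int serverFor(String rowKey, int servers) {
        return Math.floorMod(rowKey.hashCode(), servers);
    }

    // Hypothetical stand-in for a single Get against one server.
    public static String fetchRow(String rowKey) {
        return "value-of-" + rowKey;
    }

    // Step 1: bucket the row keys by the server that would hold them.
    public static Map<Integer, List<String>> bucketByServer(List<String> rowKeys, int servers) {
        Map<Integer, List<String>> buckets = new TreeMap<>();
        for (String key : rowKeys) {
            buckets.computeIfAbsent(serverFor(key, servers), s -> new ArrayList<>()).add(key);
        }
        return buckets;
    }

    // Step 2: one task per server; each task works through its own bucket in order.
    public static Map<String, String> multiGet(List<String> rowKeys, int servers) {
        Map<Integer, List<String>> buckets = bucketByServer(rowKeys, servers);
        Map<String, String> results = new ConcurrentHashMap<>();
        if (buckets.isEmpty()) {
            return results;
        }
        ExecutorService pool = Executors.newFixedThreadPool(buckets.size());
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<String> bucket : buckets.values()) {
                pending.add(pool.submit(() -> {
                    for (String key : bucket) {
                        results.put(key, fetchRow(key));
                    }
                }));
            }
            // Wait for every per-server task to finish.
            for (Future<?> f : pending) {
                f.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return results;
    }

    // Idealised estimate: the busiest server handles ceil(gets / servers) Gets.
    public static long estimateMillis(int gets, int servers, long millisPerGet) {
        return ((gets + servers - 1) / servers) * millisPerGet;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("row-17", "row-3", "row-942", "row-58");
        System.out.println(multiGet(keys, 3));
        System.out.println(estimateMillis(100, 1, 10));   // sequential: 1000
        System.out.println(estimateMillis(100, 10, 10));  // 10 nodes: 100
        System.out.println(estimateMillis(100, 50, 10));  // 50 nodes: 20
    }
}
```

In a real client the per-server task would send its whole bucket to the
regionserver instead of calling a local function, and the speedup would
depend on how evenly the keys actually fall across servers.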

/Imran

>
> JG
>
>
>> -----Original Message-----
>> From: Michael Segel [mailto:michael_segel@hotmail.com]
>> Sent: Tuesday, August 24, 2010 7:53 PM
>> To: user@hbase.apache.org
>> Subject: RE: Best way to get multiple non-sequential rows
>>
>>
>> Igor,
>>
>> What does this really buy you?
>>
>> I'm trying to figure out a use case that would show a benefit over just
>> fetching the rows individually. Since the rows are not contiguous, the
>> odds of the next row you want being in cache are going to be slight to
>> most likely not. ;-)
>>
>> Can you give a use case where having a 'multi-get' will make life
>> easier?
>>
>> Thx
>>
>> -Mike
>>
>>
>> > Date: Wed, 25 Aug 2010 07:17:13 +0600
>> > Subject: Re: Best way to get multiple non-sequential rows
>> > From: imyousuf@gmail.com
>> > To: user@hbase.apache.org
>> >
>> > Thanks Igor, I will have a look at it.
>> >
>> > /Imran
>> >
>> > On Tue, Aug 24, 2010 at 10:36 PM, Igor Ranitovic <iranitov@gmail.com> wrote:
>> > > Take a look at
>> > > https://issues.apache.org/jira/browse/HBASE-1845
>> > >
>> > > As an HBase user, multi gets is something that I have been looking
>> > > forward to for some time now. If there is enough interest it would
>> > > be great if this becomes part of 0.90.
>> > >
>> > > Take care,
>> > > i.
>> > >
>> > > Imran M Yousuf wrote:
>> > >>
>> > >> Hi,
>> > >>
>> > >> I am using the HBase client API to interact with HBase. I have
>> > >> noticed that HTableInterface has operations such as put(List<Put>),
>> > >> delete(List<Delete>), but there is no similar method for Get. Using
>> > >> scan it is possible to load a range of rows, i.e. sequential rows.
>> > >> My question is - how would it be most efficient to load N
>> > >> non-sequential rows?
>> > >>
>> > >> Currently I am calling the get(Get) method N times.
>> > >>
>> > >
>> > >
>> >
>> >
>> >
>> > --
>> > Imran M Yousuf
>> > Blog: http://imyousuf-tech.blogs.smartitengineering.com/
>> > Mobile: +880-1711402557
>>
>



-- 
Imran M Yousuf
Blog: http://imyousuf-tech.blogs.smartitengineering.com/
Mobile: +880-1711402557
