hbase-user mailing list archives

From Bryan Beaudreault <bbeaudrea...@hubspot.com>
Subject Re: Dealing with large data sets in client
Date Wed, 28 Mar 2012 17:47:33 GMT
Thanks Stack, that's correct.  It is kind of hard to describe, though I
guess it's easiest to think of it as a 2d array where the 2nd dimension is

I think your idea would be doable, too.  I'm going to try testing them both
and see how well they perform.  Luckily I'm not TOO concerned about
performance for these outliers, as long as having multiple big scanners
like that open at once doesn't degrade performance for other queries as
well.  I'll update with my findings in case someone else ends up with a
similar use case.

On Wed, Mar 28, 2012 at 1:10 PM, Stack <stack@duboce.net> wrote:

> On Tue, Mar 27, 2012 at 2:36 PM, Bryan Beaudreault
> <bbeaudreault@hubspot.com> wrote:
> > I imagine it isn't a great idea to create a ton of scans
> > (1 for each row), which is the only way I can think to do the above with
> > what we have.
> >
> You want to step through some set of rows in lock-step?  That is, get
> first N on row A, then first N on row B, etc., then when that is done,
> go back and step through next N on A, B, and so on?
> (Pardon me if I'm being a bit thick -- it's early here)
> I know of no way to do this other than as you suggest -- a scanner per
> row (not too bad given your rows are wide) or what about a scan to do
> first N, then a new scan to do next N... would that work?
> St.Ack
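
For anyone following along, here is a minimal sketch of the lock-step
stepping described above: take the first N columns of row A, then the
first N of row B, and so on, then come back for the next N of each. It is
an in-memory model only and makes no HBase calls. The class and method
names (LockStepPager, nextPage, lockStep) are made up for illustration.
In a real client the nextPage step would presumably be a scanner per row
or a fresh Get/Scan per page (for example with Scan.setBatch or a
ColumnPaginationFilter), which is an assumption and not something tested
in this thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;

/** In-memory model of lock-step paging over wide rows; hypothetical names, no HBase calls. */
final class LockStepPager {

    /** Returns up to n column qualifiers from one row, strictly after afterColumn (null = from the start). */
    static List<String> nextPage(NavigableMap<String, String> row, String afterColumn, int n) {
        NavigableMap<String, String> rest =
                (afterColumn == null) ? row : row.tailMap(afterColumn, false);
        List<String> page = new ArrayList<>();
        for (String qualifier : rest.keySet()) {
            if (page.size() == n) {
                break;
            }
            page.add(qualifier);
        }
        return page;
    }

    /** Visits the rows round-robin, n columns per visit, until every row is exhausted. */
    static List<String> lockStep(Map<String, NavigableMap<String, String>> rows, int n) {
        List<String> active = new ArrayList<>(rows.keySet()); // rows that may still have columns
        Map<String, String> cursor = new HashMap<>();         // last qualifier returned per row
        List<String> visited = new ArrayList<>();
        while (!active.isEmpty()) {
            Iterator<String> it = active.iterator();
            while (it.hasNext()) {
                String rowKey = it.next();
                List<String> page = nextPage(rows.get(rowKey), cursor.get(rowKey), n);
                if (page.isEmpty()) {
                    it.remove(); // this row is done; stop visiting it
                    continue;
                }
                for (String qualifier : page) {
                    visited.add(rowKey + ":" + qualifier);
                }
                cursor.put(rowKey, page.get(page.size() - 1));
            }
        }
        return visited;
    }
}
```

Keeping only the last qualifier seen per row as the cursor means each page
can be fetched by a brand new request, which corresponds to the "new scan
to do next N" variant. Holding one open scanner per row would replace the
cursor map with the scanners themselves.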
