hbase-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Re: Full table scan from random starting point?
Date Sat, 01 Feb 2014 01:46:14 GMT
Hi Robert,

You can build a random start key, give it to your scanner, and scan until
the end of the table. Then open a second scanner from the beginning of the
table and use that same key as its stop key. Together the two scans cover
every row once, in the wrapped-around order you are looking for.
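
To make the two-pass idea concrete, here is a minimal sketch in plain Java.
It is not HBase client code: the table is modeled as a sorted in-memory map,
and the class and method names (WrapAroundScan, scanFrom) are made up for
illustration. With the real client the two ranges would correspond to two
Scan objects, the first with only a start row and the second with only a
stop row. A Scan's start row is inclusive and its stop row is exclusive, so
the two passes do not overlap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for an HBase table: row keys kept in sorted order.
class WrapAroundScan {

    // Returns every row key exactly once, starting at startKey and wrapping
    // around to the beginning of the table. startKey does not have to exist.
    static List<String> scanFrom(NavigableMap<String, String> table, String startKey) {
        List<String> order = new ArrayList<>();
        // First pass: from startKey (inclusive) to the end of the table.
        order.addAll(table.tailMap(startKey, true).keySet());
        // Second pass: from the start of the table up to startKey (exclusive).
        order.addAll(table.headMap(startKey, false).keySet());
        return order;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        for (int i = 1; i <= 6; i++) {
            table.put(String.valueOf(i), "value" + i);
        }
        System.out.println(scanFrom(table, "3")); // [3, 4, 5, 6, 1, 2]
        System.out.println(scanFrom(table, "5")); // [5, 6, 1, 2, 3, 4]
    }
}
```

The two calls in main reproduce the read order of client2 and client3 from
the example quoted below.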

Also, this might interest you:
https://issues.apache.org/jira/browse/HBASE-9272

JM


2014-01-31 Robert Dyer <rdyer@iastate.edu>:

> Let's say I have one client on each of my regionservers.  Each client needs
> to do a full scan on the same table.  The order in which the rows are
> scanned by clients does not matter.
>
> Is it possible to have each client start at a random (or better, the first
> row located on the local rs) point in the table so that if I start all of
> them at once they don't all peg the same rs for reads?
>
> Example (to keep it simple, assume 3 RS):
>
> RS1: rows 1-2
> RS2: rows 3-4
> RS3: rows 5-6
>
> client1 (on RS1) reads rows: 1, 2, 3, 4, 5, 6
> client2 (on RS2) reads rows: 3, 4, 5, 6, 1, 2
> client3 (on RS3) reads rows: 5, 6, 1, 2, 3, 4
>
> Obviously they may progress at different rates and still wind up hitting
> the same RSs, but at least we can start out a bit more distributed.
>
> Is this easily possible, without first obtaining a list of all rows and
> manually batching them up?
>
