hbase-user mailing list archives

From Robert Dyer <rd...@iastate.edu>
Subject Full table scan from random starting point?
Date Fri, 31 Jan 2014 22:17:18 GMT
Let's say I have one client on each of my regionservers.  Each client needs
to do a full scan on the same table.  The order in which the rows are
scanned by clients does not matter.

Is it possible to have each client start at a random point in the table
(or better, at the first row hosted on its local regionserver), so that if
I start all of them at once they don't all peg the same regionserver for
reads?

Example (to keep it simple, assume 3 RS):

RS1: rows 1-2
RS2: rows 3-4
RS3: rows 5-6

client1 (on RS1) reads rows: 1, 2, 3, 4, 5, 6
client2 (on RS2) reads rows: 3, 4, 5, 6, 1, 2
client3 (on RS3) reads rows: 5, 6, 1, 2, 3, 4

Obviously they may progress at different rates and still wind up hitting
the same RSs, but at least we can start out a bit more distributed.

Is this easily possible, without first obtaining a list of all rows and
manually batching them up?
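
To make the ordering in the example concrete, here is a minimal sketch of the
rotation, assuming each client already knows the start key of every region
and which region is local to it. It needs only the region boundaries, not a
list of all rows. The names RotatedScanPlan, Range, and plan are hypothetical,
and plain strings stand in for byte[] row keys; no HBase calls are made. In a
real client each range would presumably become the start and stop row of one
scan, but that wiring is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RotatedScanPlan {

    /** Half-open row range [startRow, stopRow); "" means unbounded on that side. */
    public static final class Range {
        public final String startRow;
        public final String stopRow;

        public Range(String startRow, String stopRow) {
            this.startRow = startRow;
            this.stopRow = stopRow;
        }

        @Override
        public String toString() {
            return "[" + startRow + ", " + stopRow + ")";
        }
    }

    /**
     * Returns one range per region, beginning with the local region and
     * wrapping around to the start of the table, so the ranges together
     * cover the whole table exactly once.
     *
     * regionStartKeys must be sorted, with "" as the first region's start key.
     */
    public static List<Range> plan(List<String> regionStartKeys, int localRegion) {
        int n = regionStartKeys.size();
        if (localRegion < 0 || localRegion >= n) {
            throw new IllegalArgumentException("localRegion out of range: " + localRegion);
        }
        List<Range> ranges = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            int r = (localRegion + i) % n;               // rotate the region order
            String start = regionStartKeys.get(r);
            // A region ends where the next one starts; the last region is unbounded.
            String stop = (r + 1 < n) ? regionStartKeys.get(r + 1) : "";
            ranges.add(new Range(start, stop));
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Three regions as in the example: rows 1-2, rows 3-4, rows 5-6.
        List<String> startKeys = Arrays.asList("", "3", "5");
        for (int client = 0; client < startKeys.size(); client++) {
            System.out.println("client" + (client + 1) + ": " + plan(startKeys, client));
        }
    }
}
```

Each client would then scan its ranges one after another, which gives the
staggered start described above but does nothing to keep the clients apart
once they progress at different rates.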
