incubator-cassandra-user mailing list archives

From Jamie Rothfeder <jamie.rothfe...@gmail.com>
Subject Hadoop Integration: Limiting scan to a range of keys
Date Sat, 01 Dec 2012 02:04:37 GMT
Hey All,

I have a bunch of time-series data stored in a cluster using a
ByteOrderedPartitioner. My keys are time buckets, each representing the
events that occurred in one hour. I've been trying to write a MapReduce job
that considers only events within a certain time range by specifying an
input range, but this doesn't seem to be working.
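
To make the key layout concrete, here is a minimal sketch of how such hourly bucket keys might be derived and serialized. It assumes the keys are epoch seconds truncated to the hour and stored as 4-byte big-endian ints, which is what the snippet further down suggests but the message does not state outright. The class and method names (HourBucketKey, hourBucket, keyBytes, toHex) are hypothetical and use only the JDK, not the Cassandra API.

```java
import java.nio.ByteBuffer;

public class HourBucketKey {
    static final long SECONDS_PER_HOUR = 3600L;

    // Truncate an epoch-seconds timestamp down to the start of its hour.
    static int hourBucket(long epochSeconds) {
        return (int) (epochSeconds - Math.floorMod(epochSeconds, SECONDS_PER_HOUR));
    }

    // Serialize the bucket as 4 big-endian bytes (ByteBuffer's default order).
    // Assumption: this matches how the row keys were written.
    static byte[] keyBytes(int bucket) {
        return ByteBuffer.allocate(4).putInt(bucket).array();
    }

    // Render bytes as lowercase hex, purely for inspection.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int bucket = hourBucket(1353457234L);        // some event inside the hour
        System.out.println(bucket);                  // prints 1353456000
        System.out.println(toHex(keyBytes(bucket))); // prints 50ac1980
    }
}
```

With a byte-ordered partitioner, rows are ordered by these raw key bytes, so under this encoding consecutive hours sit next to each other for non-negative timestamps.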

I expect the following code to scan data for a single key (1353456000), but
it is scanning all keys.

    int key = 1353456000;
    IPartitioner part = ConfigHelper.getInputPartitioner(job.getConfiguration());
    Token token = part.getToken(ByteBufferUtil.bytes(key));
    ConfigHelper.setInputRange(job.getConfiguration(),
        part.getTokenFactory().toString(token),
        part.getTokenFactory().toString(token));

Any idea what I'm doing wrong?

Thanks,
Jamie
