cassandra-user mailing list archives

From "Jasper K." <>
Subject Extracting data from SSTable files with MapReduce
Date Sat, 13 Apr 2013 07:58:07 GMT

Does anyone have experience running a MapReduce job directly against a
column family's (CF's) SSTable files?

I have a use case where this seems to be an option. I want to export all
data from a CF to a flat file format for statistical analysis.

Some factors that make it (more) doable in my case:
- The Cassandra instance is not online (no writes, no reads).
- The .db files were exported from another instance; I now have them all in
  one place.

The SSTable files are in the -f- format from 0.8.10.

Looking at this : it
should be possible to write a Hadoop RecordReader for Cassandra rowkeys.
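
To make the idea concrete, here is a minimal sketch of the loop such a reader would run, written in Python rather than as a Hadoop class. It assumes a simplified row layout in the data file of [2-byte key length][key][8-byte row size][row bytes]; that layout is an assumption on my part and has not been checked against the 0.8.10 source. The names iter_row_keys and make_row and the sample bytes are made up for illustration.

```python
import io
import struct


def iter_row_keys(stream):
    """Yield (key, row_size) pairs from a binary stream.

    ASSUMED layout per row (not verified against Cassandra 0.8.10):
    [2-byte big-endian key length][key][8-byte big-endian row size][row bytes]
    """
    while True:
        header = stream.read(2)
        if not header:
            return  # clean end of stream
        if len(header) < 2:
            raise ValueError("truncated key length")
        (key_len,) = struct.unpack(">H", header)

        key = stream.read(key_len)
        size_bytes = stream.read(8)
        if len(key) < key_len or len(size_bytes) < 8:
            raise ValueError("truncated row header")
        (row_size,) = struct.unpack(">q", size_bytes)
        if row_size < 0:
            raise ValueError("negative row size")

        # Skip the row body; only the key is of interest here.
        stream.seek(row_size, io.SEEK_CUR)
        yield key, row_size


def make_row(key, body):
    """Build one synthetic row in the assumed layout (test helper)."""
    return struct.pack(">H", len(key)) + key + struct.pack(">q", len(body)) + body


# Two synthetic rows standing in for a real data file.
sample = make_row(b"user1", b"\x00" * 10) + make_row(b"user42", b"abc")
keys = [key for key, _ in iter_row_keys(io.BytesIO(sample))]
```

In a Hadoop RecordReader the same loop would presumably live inside nextKeyValue(). Since a row boundary cannot be found from an arbitrary byte offset in this layout, each input split would probably have to cover a whole file.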

But maybe I am not fully aware of what I am getting into.


Jasper
