incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Extracting data from SSTable files with MapReduce
Date Sun, 14 Apr 2013 18:42:53 GMT
> The SSTable files are in the -f- format from 0.8.10.
If you can upgrade to the latest version, it will make things easier.
Start a node and run nodetool upgradesstables to rewrite the files in the current format.

The org.apache.cassandra.tools.SSTableExport class provides a blueprint for reading rows from disk.
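
SSTableExport is also the class behind the bin/sstable2json script, so one possible route to a flat file is to dump each SSTable to JSON and flatten that. The sketch below is only an illustration, not a tested export pipeline: it assumes the dump is a JSON list of objects with a "key" field and a "columns" list whose entries start with name, value and timestamp. That layout differs between Cassandra versions, so check the output of your own version first. The function names and the sample data are made up for the example.

```python
import csv
import io
import json


def flatten_rows(json_text):
    """Yield one (row_key, column_name, value, timestamp) tuple per column.

    Assumes a dump shaped like:
    [{"key": "...", "columns": [[name, value, timestamp, ...], ...]}, ...]
    """
    for row in json.loads(json_text):
        for column in row["columns"]:
            # Only the first three fields are used; any extra fields
            # (for example deletion or TTL markers) are ignored.
            name, value, timestamp = column[0], column[1], column[2]
            yield row["key"], name, value, timestamp


def to_csv(json_text):
    """Return the flattened rows as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["row_key", "column", "value", "timestamp"])
    writer.writerows(flatten_rows(json_text))
    return out.getvalue()


# Hypothetical sample in the assumed layout, not real sstable2json output.
SAMPLE = '[{"key": "6b31", "columns": [["c1", "v1", 1], ["c2", "v2", 2]]}]'

if __name__ == "__main__":
    print(to_csv(SAMPLE), end="")
```

For large SSTables the whole dump will not fit in memory, so a streaming JSON parser or a per-file job would be the more realistic choice; the flattening logic stays the same.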

Hope that helps.

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 13/04/2013, at 7:58 PM, Jasper K. <jasper.knulst@incentro.com> wrote:

> Hi,
> 
> Does anyone have any experience with running a MapReduce directly against a CF's SSTable files?
> 
> I have a use case where this seems to be an option. I want to export all data from a CF to a flat file format for statistical analysis.
> 
> Some factors that make it (more) doable in my case:
> -The Cassandra instance is not 'on-line' (no writes- no reads)
> -The .db files were exported from another instance. I got them all in one place now
> 
> The SSTable files are in the -f- format from 0.8.10.
> 
> Looking at this: http://wiki.apache.org/cassandra/ArchitectureSSTable it should be possible to write a Hadoop RecordReader for Cassandra rowkeys.
> 
> But maybe I am not fully aware of what I am up to.
> 
> -- 
> 
> Jasper 

