incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Extracting data from SSTable files with MapReduce
Date Tue, 16 Apr 2013 21:33:25 GMT
> I did try to upgrade to 1.2 but it did not work out. Maybe too many versions in between.
Newer versions should be able to read older file formats. What was the error?

> Why would later formats make this easier you think?
It will be easier to write against the current code base, and you will find it easier to get help.
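
As a rough illustration of the flat-file export discussed in this thread: the sstable2json tool that ships with Cassandra is the command-line front end to the SSTableExport class mentioned in the quoted message below, so one option is to dump each SSTable to JSON and flatten that. The sketch below is only a sketch. It assumes the JSON is a list of objects, each with a "key" and a "columns" list of [name, value, timestamp, ...] entries; the layout differs between Cassandra versions, so check it against the real output first. The function names flatten_export and write_flat_file are made up for this example.

```python
import csv
import io
import json


def flatten_export(rows):
    """Yield (row_key, column_name, value, timestamp) tuples.

    ASSUMPTION: `rows` is a list of dicts shaped like
    {"key": ..., "columns": [[name, value, timestamp, ...], ...]}.
    Verify this against the JSON your Cassandra version actually emits.
    """
    for row in rows:
        for column in row["columns"]:
            # Only the first three fields are used; any extra
            # per-column flags after the timestamp are ignored.
            name, value, timestamp = column[:3]
            yield row["key"], name, value, timestamp


def write_flat_file(rows, out):
    """Write one tab-separated line per column to the file-like `out`."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for record in flatten_export(rows):
        writer.writerow(record)


def load_export(path):
    """Load a JSON export from disk (hypothetical helper for this sketch)."""
    with io.open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Something like `sstable2json <path to a -Data.db file> > export.json` should produce the input; pass the result of load_export("export.json") to write_flat_file along with an open output file. Keys, names and values may come out hex-encoded, so a decoding step may be needed before analysis.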


Cheers


-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 16/04/2013, at 5:37 AM, Jasper K. <jasper.knulst@incentro.com> wrote:

> Hi Aaron,
> 
> I did try to upgrade to 1.2 but it did not work out. Maybe too many versions in between.
> 
> Why would later formats make this easier you think?
> 
> Jasper
> 
> 
> 
> 2013/4/14 aaron morton <aaron@thelastpickle.com>
>> The SSTable files are in the -f- format from 0.8.10.
> 
> If you can upgrade to the latest version it will make things easier. 
> Start a node and use nodetool upgradesstables. 
> 
> The org.apache.cassandra.tools.SSTableExport class provides a blueprint for reading rows from disk.
> 
> Hope that helps.
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 13/04/2013, at 7:58 PM, Jasper K. <jasper.knulst@incentro.com> wrote:
> 
>> Hi,
>> 
>> Does anyone have any experience with running a MapReduce directly against a CF's SSTable files?
>> 
>> I have a use case where this seems to be an option. I want to export all data from a CF to a flat file format for statistical analysis.
>> 
>> Some factors that make it (more) doable in my case:
>> - The Cassandra instance is not 'on-line' (no writes, no reads)
>> - The .db files were exported from another instance. I have them all in one place now
>> 
>> The SSTable files are in the -f- format from 0.8.10.
>> 
>> Looking at this: http://wiki.apache.org/cassandra/ArchitectureSSTable it should be possible to write a Hadoop RecordReader for Cassandra rowkeys.
>> 
>> But maybe I am not fully aware of what I am up to.
>> 
>> -- 
>> 
>> Jasper 
