incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: optimizing use of sstableloader / SSTableSimpleUnsortedWriter
Date Mon, 27 Aug 2012 08:19:10 GMT
> After thinking about how
> sstables are done on disk, it seems best (required??) to write out
> each row at once.  
Sort of. We only want one instance of the row per SSTable created. 
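
One way to get there is to group the data points by row key before handing them to the writer. The following is only a sketch in Python, not Cassandra code: `group_by_row` and the sample metric names are hypothetical, and it assumes the data being grouped fits in memory.

```python
from collections import defaultdict


def group_by_row(points):
    """Group (row_key, column_name, value) tuples so each row is emitted once."""
    rows = defaultdict(dict)
    for row_key, column_name, value in points:
        rows[row_key][column_name] = value
    # Emit every row exactly once, with its columns sorted by name
    # (for time series the column name would typically be the timestamp).
    return [(key, sorted(cols.items())) for key, cols in sorted(rows.items())]


# Hypothetical data points arriving interleaved across two metrics.
points = [
    ("cpu", 2, 0.7),
    ("mem", 1, 0.4),
    ("cpu", 1, 0.5),
]
rows = group_by_row(points)
for key, columns in rows:
    print(key, columns)
```

For a source like PostgreSQL, ordering the export query by metric and then by timestamp would probably achieve the same grouping without holding everything in memory.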


> Any other tips to improve load time or reduce the load on the cluster
> or subsequent compaction activity? 

Fewer SSTables mean less compaction, so go as high as you can on the bufferSizeInMB param
for the SSTableSimpleUnsortedWriter.
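
To illustrate why the buffer size matters, here is a simplified model in Python. It is not the Cassandra implementation: `count_sstables` is a hypothetical helper that assumes the writer flushes one SSTable whenever the buffered bytes exceed the limit, and once more on close. The row sizes are made up.

```python
def count_sstables(row_sizes_bytes, buffer_size_mb):
    """Count flushes for a writer that flushes once its buffer exceeds the limit."""
    limit = buffer_size_mb * 1024 * 1024
    buffered = 0
    sstables = 0
    for size in row_sizes_bytes:
        buffered += size
        if buffered > limit:
            sstables += 1  # buffer exceeded: one SSTable is written out
            buffered = 0
    if buffered > 0:
        sstables += 1  # remaining rows are flushed when the writer is closed
    return sstables


# Hypothetical load: 1000 rows of 1 MiB each.
row_sizes = [1024 * 1024] * 1000
small = count_sstables(row_sizes, 16)
large = count_sstables(row_sizes, 256)
print(small, large)
```

Under this model the larger buffer produces far fewer SSTables for the same data, at the cost of more heap in the process doing the writing.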

There is also an SSTableSimpleWriter. Because it expects rows to be added in sorted order, it
does not buffer and can create bigger sstables.
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/SSTableSimpleWriter.java
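
A sketch of what "ordered" could mean in practice, in Python rather than Java. It assumes the cluster uses RandomPartitioner and that its token is the absolute value of the key's MD5 digest read as a signed big-endian integer, with ties broken by the raw key bytes. Check both assumptions against the Cassandra version in use. `md5_token` and `sort_for_sorted_writer` are hypothetical names.

```python
import hashlib


def md5_token(key: bytes) -> int:
    """Assumed RandomPartitioner token: abs() of the MD5 digest as a signed integer."""
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def sort_for_sorted_writer(keys):
    """Order row keys by token, then by key bytes, before writing them in sequence."""
    return sorted(keys, key=lambda k: (md5_token(k), k))


# Hypothetical row keys.
keys = [b"cpu", b"mem", b"disk", b"net"]
ordered = sort_for_sorted_writer(keys)
for key in ordered:
    print(md5_token(key), key)
```

With an order-preserving partitioner the sort would instead be on the key bytes themselves.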


> Right now my Cassandra data store has about 4 months of data and we
> have 5 years of historical 
ingest all the histories!

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 25/08/2012, at 12:56 PM, Aaron Turner <synfinatic@gmail.com> wrote:

> So I've read: http://www.datastax.com/dev/blog/bulk-loading
> 
> Are there any tips for using sstableloader /
> SSTableSimpleUnsortedWriter to migrate time series data from our old
> datastore (PostgreSQL) to Cassandra?  After thinking about how
> sstables are done on disk, it seems best (required??) to write out
> each row at once.  I.e., if each row == 1 year's worth of data and you
> have say 30,000 rows, write one full row at a time (a full year's worth
> of data points for a given metric) rather than 1 data point for 30,000
> rows.
> 
> Any other tips to improve load time or reduce the load on the cluster
> or subsequent compaction activity?   All my CF's I'll be writing to
> use compression and leveled compaction.
> 
> Right now my Cassandra data store has about 4 months of data and we
> have 5 years of historical (not sure yet how much we'll actually load,
> but minimally 1 year's worth).
> 
> Thanks!
> 
> -- 
> Aaron Turner
> http://synfin.net/         Twitter: @synfinatic
> http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
> Those who would give up essential Liberty, to purchase a little temporary
> Safety, deserve neither Liberty nor Safety.
>    -- Benjamin Franklin
> "carpe diem quam minimum credula postero"

