cassandra-user mailing list archives

From: Boris Solovyov <>
Subject: Re: Seeking suggestions for a use case
Date: Tue, 12 Feb 2013 20:09:03 GMT
Thanks for your suggestions and feedback! We will see how it goes. I am
trying to set up the first test cluster now :)

On Tue, Feb 12, 2013 at 10:08 AM, Edward Capriolo <> wrote:

> Your use case is 100% on the money for Cassandra. But let me take a
> chance to slam the other NoSQLs (not really slam, but you know).
> Riak is a key-value store. It is not a column family store where a
> row key has a map of sorted values. This makes time series more
> awkward, as the series has to span many rows rather than one
> large row.
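> To make "one large row" concrete, here is a minimal sketch of that
> layout in CQL 3 via the DataStax Python driver (the keyspace, table,
> and column names are made up for illustration, not from this thread):
>
>     from cassandra.cluster import Cluster
>
>     cluster = Cluster(['127.0.0.1'])
>     session = cluster.connect()
>     session.execute("""
>         CREATE KEYSPACE IF NOT EXISTS ticks WITH replication =
>             {'class': 'SimpleStrategy', 'replication_factor': 1}
>     """)
>     # One storage row per series: the partition key is the series id,
>     # and points live in that row as columns sorted by timestamp, so a
>     # time-range read is one sequential slice of a single wide row.
>     session.execute("""
>         CREATE TABLE IF NOT EXISTS ticks.points (
>             series_id text,
>             ts        timestamp,
>             value     bigint,   -- signed 64-bit; assumes values fit
>             PRIMARY KEY (series_id, ts)
>         )
>     """)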
> HBase has similar problems with time series. On the one hand, if your
> row keys are the series you get hotspots; if your columns are the time
> series you run into two subtle issues. Last I checked, HBase's on-disk
> format repeats the key each time (somewhat wasteful):
> key,column,value
> key,column,value
> key,column,value
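> A back-of-envelope of what that repetition costs, with made-up sizes
> (16-byte key, 8-byte column qualifier, 8-byte value) over one day of
> 1-second samples in a single series:
>
>     # Repeating the row key per cell vs. storing it once per row.
>     key_b, col_b, val_b = 16, 8, 8   # assumed sizes, in bytes
>     cells = 86400                    # one day of 1-sec samples
>     repeated_key = cells * (key_b + col_b + val_b)
>     shared_key = key_b + cells * (col_b + val_b)
>     print(repeated_key, shared_key)  # 2764800 vs 1382416 bytes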
> Also, there are issues with really big rows, although they are dealt
> with in a similar way to really wide rows in Cassandra: just use time
> as part of the row key and the rows will not get that large.
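> A sketch of the time-in-the-row-key idea (the names and the daily
> bucket size are assumptions): each series gets one row per UTC day,
> capping a row at 86,400 one-second points.
>
>     from datetime import datetime
>
>     def bucketed_row_key(series_id, ts):
>         # One partition per series per day keeps rows bounded, while a
>         # recent-range read still touches only one or two partitions.
>         return '%s:%s' % (series_id, ts.strftime('%Y%m%d'))
>
>     print(bucketed_row_key('AAPL', datetime(2013, 2, 12)))  # AAPL:20130212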
> I do not think you need leveled compaction for an append-only
> workload, although it might be helpful depending on how long you want
> to keep these rows. If you are not keeping them very long, leveled
> compaction could keep the on-disk size smaller.
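> For reference, switching a table to leveled compaction is a one-line
> schema change (CQL 3 syntax; the table name and SSTable size here are
> hypothetical):
>
>     from cassandra.cluster import Cluster
>
>     session = Cluster(['127.0.0.1']).connect('ticks')
>     session.execute("""
>         ALTER TABLE points
>         WITH compaction = {'class': 'LeveledCompactionStrategy',
>                            'sstable_size_in_mb': 5}
>     """)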
> Column TTLs in Cassandra do not require extra storage. They are a very
> efficient way to do this. Otherwise you have to scan through your data
> with some offline process and delete.
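> The TTL is set per write, so expiry needs no separate pass at all. A
> sketch against the hypothetical table above, with a one-week expiry:
>
>     from datetime import datetime
>     from cassandra.cluster import Cluster
>
>     session = Cluster(['127.0.0.1']).connect('ticks')
>     one_week = 7 * 24 * 3600  # TTL in seconds
>     # The column silently disappears from reads after one_week seconds.
>     session.execute(
>         "INSERT INTO points (series_id, ts, value) "
>         "VALUES (%s, %s, %s) USING TTL %s",
>         ('AAPL', datetime.utcnow(), 530, one_week))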
> Do not worry about gc_grace too much. The moral is that, because of
> distributed deletes, some data lives on disk for a while after it is
> deleted. All this means is you need "some" more storage than just the
> space for your live data.
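> If you ever do want to shorten it, gc_grace_seconds is a per-table
> knob (default 864000, i.e. 10 days); a sketch with hypothetical names:
>
>     from cassandra.cluster import Cluster
>
>     session = Cluster(['127.0.0.1']).connect('ticks')
>     # One day instead of ten; repair must then run more often than
>     # this window, or a node that missed a delete can resurrect data.
>     session.execute("ALTER TABLE points WITH gc_grace_seconds = 86400")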
> Don't use row cache with wide rows. REPEAT: Don't use row cache with
> wide rows.
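> Per table, that means caching keys only, not rows. A sketch using the
> 1.2-era CQL option names (table name hypothetical):
>
>     from cassandra.cluster import Cluster
>
>     session = Cluster(['127.0.0.1']).connect('ticks')
>     # Cache row-key positions only; never pull whole wide rows into heap.
>     session.execute("ALTER TABLE points WITH caching = 'keys_only'")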
> Compaction throughput is metered on each node (again, not a setting to
> worry about).
> If you are hitting flush_largest_memtables_at and
> reduce_cache_capacity_to, it basically means you have over-tuned or
> you do not have enough hardware. These are mostly emergency valves,
> and if you are set up well they are not a factor. They are only around
> to relieve memory pressure, to prevent the node from hitting a cycle
> where it is in GC more than it is in serving mode.
> Whew!
> Anyway, nice to see that you are trying to understand the knobs
> before kicking the tires.
> On Tue, Feb 12, 2013 at 5:55 AM, Boris Solovyov
> <> wrote:
> > Hello list!
> >
> > I have an application with the following characteristics:
> >
> > - data is time series: tens of millions of series at 1-sec
> >   granularity, like stock ticker data (rough volume arithmetic is
> >   sketched after this list)
> > - values are a timestamp and an integer (uint64)
> > - data is append-only, never updated
> > - data is not written in the distant past; maybe sometimes 10 sec
> >   ago, but not more
> > - the workload is write-mostly, about 99.9% writes I think
> > - most reads will be of recent data, always over a range of timestamps
> > - data needs purging after some time, e.g. 1 week
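> > A rough back-of-envelope for those numbers (every input here is an
> > assumption, just to size the conversation):
> >
> >     # Assume 20M series, one point per second, ~16 bytes of
> >     # timestamp-plus-value payload per point, one-week retention.
> >     series = 20 * 10**6
> >     writes_per_sec = series          # ~20M writes/sec, cluster-wide
> >     bytes_per_point = 16
> >     week = 7 * 24 * 3600
> >     raw_bytes_per_week = series * week * bytes_per_point
> >     print(writes_per_sec, raw_bytes_per_week / 1e12)  # ~193.5 TB raw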
> >
> > I am considering Cassandra. No other existing database (HBase, Riak,
> > etc.) seems well suited for this.
> >
> > Questions:
> >
> > Did I miss some other database that could work? Please suggest one
> > if you know it.
> > What are the benefits or drawbacks of leveled compaction for this
> > workload?
> > Setting column TTLs seems a bad choice due to extra storage. Agree?
> > Is it efficient to run a routine batch job to purge the oldest data?
> > Will there be any gotchas with that (like a full scan of something
> > instead of just the oldest data, maybe)?
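> > (A sketch of what such a batch purge could look like if rows are
> > bucketed by day as suggested above, so each delete drops a whole
> > expired partition; all names here are hypothetical:)
> >
> >     from cassandra.cluster import Cluster
> >
> >     session = Cluster(['127.0.0.1']).connect('ticks')
> >     expired_day = '20130205'  # the week-old bucket chosen by the job
> >     for series_id in ('AAPL', 'GOOG'):  # really: the full series list
> >         # Deleting by the full partition key drops the whole day
> >         # bucket with one row-level tombstone, with no column scan.
> >         session.execute(
> >             "DELETE FROM points WHERE series_id = %s",
> >             ('%s:%s' % (series_id, expired_day),))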
> > Will a column index be beneficial? If reads are scans, does it
> > matter, or is it just extra work and storage space to maintain,
> > without much benefit, especially since reads are rare?
> > How does gc_grace_seconds impact operations in this workload? Will
> > purges of old data leave SSTables mostly obsolete, rather than
> > sparsely obsolete? I think they will. So, after a purge, tombstones
> > can be GCed shortly, with no need for the default 10-day grace
> > period. BUT, I read in the docs that if gc_grace_seconds is short,
> > then nodetool repair needs to run quite often. Is that true? Why
> > would that be needed in my use case?
> > Related question: is it sensible to set tombstone_threshold to 1.0
> > but tombstone_compaction_interval to something short, like 1 hour? I
> > suppose this depends on whether I am correct that SSTables will be
> > deleted entirely, instead of just getting sparse.
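> > (For reference, both are compaction subproperties; the settings just
> > described would look like this, table name hypothetical:)
> >
> >     from cassandra.cluster import Cluster
> >
> >     session = Cluster(['127.0.0.1']).connect('ticks')
> >     session.execute("""
> >         ALTER TABLE points
> >         WITH compaction = {'class': 'SizeTieredCompactionStrategy',
> >                            'tombstone_threshold': 1.0,
> >                            'tombstone_compaction_interval': 3600}
> >     """)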
> > Should I disable the row cache (row_cache_provider)? It invalidates
> > every row on update, right? I will be updating rows constantly, so
> > it seems not beneficial.
> > Docs say compaction_throughput_mb_per_sec is per "entire system."
> > Does that mean per NODE, or per ENTIRE CLUSTER? Will this cause
> > trouble with periodic deletions of expired columns? Do I need to
> > make sure my purges of old data are trickled out over time to avoid
> > huge compaction overhead? But in that case, SSTables will become
> > sparsely deleted, right? And then re-compacted, which seems wasteful
> > if the remaining data will soon be purged again and there will be
> > another re-compaction. So this is partly why I asked about
> > tombstone_threshold and the compaction interval -- I think it is
> > best if I can purge data in such a way that Cassandra never
> > recompacts SSTables, but just realizes "oh, the whole thing is dead,
> > I can delete it, no work needed." But I am not sure if my considered
> > settings will have unintended consequences.
> > Finally, with the proposed workload, will there be trouble with
> > flush_largest_memtables_at, reduce_cache_capacity_to, and
> > reduce_cache_sizes_at? These are described as "emergency measures"
> > in the docs. If my workload is an edge case that could trigger bad
> > emergency-measure behavior, I hope you can tell me that :-)
> >
> > Many thanks!
> >
> > Boris
