cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Row or Supercolumn with approximately n columns
Date Mon, 02 Jan 2012 21:24:20 GMT
Even if compaction enforced a limit on the number of columns in a row, there would still
be issues with concurrent writes and with read repair. For example, node A may say these
are the first n columns while node B says something else, and you only know which one is
correct at read time.

Have you considered using a TTL on the columns?
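
As a rough way to pick a TTL for "approximately n columns": if writes to a row arrive at a
fairly steady rate, a TTL of about n divided by that rate keeps roughly n live columns. A
minimal sketch in Python, assuming a steady write rate (the function name and parameters
are hypothetical, and no Cassandra client calls are involved):

```python
import math


def ttl_for_target_columns(target_columns: int, writes_per_second: float) -> int:
    """Return a TTL in seconds that keeps roughly `target_columns` live columns.

    Assumes a steady write rate to the row. With bursty writes the live
    column count will swing above and below the target.
    """
    if target_columns <= 0 or writes_per_second <= 0:
        raise ValueError("target_columns and writes_per_second must be positive")
    # Columns written in the last `ttl` seconds are still live, so
    # live columns ~= writes_per_second * ttl.
    return max(1, math.ceil(target_columns / writes_per_second))


# One write every two seconds, aiming for about 1000 columns per row.
ttl_seconds = ttl_for_target_columns(1000, 0.5)
```

The result would be passed as the TTL on each column write.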

Depending on the use case, you could also have writes trim the data periodically or
randomly, or trim on reads.
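
A sketch of the random trim-on-write idea, using a plain Python dict to stand in for a row
whose column names sort by time. The function names and the default trim probability are
made up for illustration. Against a real cluster the trim step would issue deletes, which
create tombstones, so the probability should stay low.

```python
import random


def columns_to_trim(column_names, n):
    """Return the oldest column names beyond the newest n (names sort by time)."""
    ordered = sorted(column_names)
    excess = len(ordered) - n
    return ordered[:excess] if excess > 0 else []


def write_with_random_trim(row, name, value, n, trim_probability=0.01, rng=random):
    """Write one column, then occasionally trim the row back to n columns.

    `row` is an in-memory dict standing in for a Cassandra row. Between
    trims the row can grow past n, so the limit is only approximate.
    """
    row[name] = value
    if rng.random() < trim_probability:
        for stale in columns_to_trim(row.keys(), n):
            del row[stale]


row = {}
for ts in range(20):
    # trim_probability=1.0 forces a trim on every write, for demonstration.
    write_with_random_trim(row, ts, "sample", n=5, trim_probability=1.0)
```

The same `columns_to_trim` helper could be called from the read path instead, if trimming
on reads suits the workload better.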

It would also make sense to partition the time series data into different rows. And viva la
Standard Column Families!
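
One common way to partition is to put a time bucket in the row key, so each row holds a
bounded slice of the series and old rows can be dropped whole. A minimal sketch, assuming
a "series id, colon, UTC bucket" key layout (the layout, function name, and bucket sizes
are illustrative, not anything Cassandra prescribes):

```python
from datetime import datetime, timezone

# strftime patterns for each supported bucket size (hypothetical choices).
_BUCKET_FORMATS = {"hour": "%Y%m%d%H", "day": "%Y%m%d", "month": "%Y%m"}


def bucketed_row_key(series_id: str, ts: datetime, bucket: str = "day") -> str:
    """Build a row key such as 'sensor-42:20120102' from a series id and time."""
    if bucket not in _BUCKET_FORMATS:
        raise ValueError(f"unknown bucket size: {bucket!r}")
    if ts.tzinfo is None:
        raise ValueError("ts must be timezone-aware")
    # Normalise to UTC so every writer agrees on the bucket boundary.
    utc_ts = ts.astimezone(timezone.utc)
    return f"{series_id}:{utc_ts.strftime(_BUCKET_FORMATS[bucket])}"


key = bucketed_row_key(
    "sensor-42", datetime(2012, 1, 2, 21, 24, 20, tzinfo=timezone.utc)
)
```

Reading the latest data then means querying the current bucket's row and, if needed, the
previous one.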

Hope that helps. 
 
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 25/12/2011, at 7:48 PM, Praveen Baratam wrote:

> Hello Everybody,
> 
> Happy Christmas.
> 
> I know that this topic has come up quite a few times on the Dev and User lists but did not
> culminate in a solution.
> 
> http://www.mail-archive.com/user@cassandra.apache.org/msg15367.html
> 
> The above discussion on the User list talks about AbstractCompactionStrategy, but I could
> not find any relevant documentation as it's a fairly new feature in Cassandra.
> 
> Let me state the requirement and use case again.
> 
> I need a ColumnFamily (CF) wide or SuperColumn (SC) wide option to approximately limit
> the number of columns to "n". "n" can vary a lot, and the intention is to throw away stale
> data, not to maintain any hard limit on the CF or SC. It's very useful for storing time-series
> data where stale data is not necessary. The goal is to achieve this with minimum overhead,
> and since compaction happens all the time it would be clever to implement it as part of compaction.
> 
> Thanks in advance.
> 
> Praveen

