cassandra-user mailing list archives

From Jonathan Haddad <>
Subject Re: Cassandra sort using updatable query
Date Wed, 12 Nov 2014 18:37:27 GMT
With Cassandra you're going to want to model your tables around the
queries you need to run, rather than, as in a relational database, building
tables in 3NF first and optimizing afterwards.

For your optimized select query, your table (with caveat, see below) could
start out as:

create table words (
  year int,
  frequency int,
  content text,
  primary key (year, frequency, content) );
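
As a rough sketch of how that table would serve the select from the original
question (the sample rows are made up for illustration and are not from the
thread):

```cql
-- Hypothetical sample rows, for illustration only.
INSERT INTO words (year, frequency, content) VALUES (2010, 42, 'abc');
INSERT INTO words (year, frequency, content) VALUES (2010, 17, 'def');

-- Top 10 words for one year. Rows inside the year partition are stored
-- sorted by the clustering columns (frequency, then content), so this
-- reads a reversed slice of a single partition.
SELECT content FROM words WHERE year = 2010 ORDER BY frequency DESC LIMIT 10;
```

If descending reads are the common case, the table could instead be declared
WITH CLUSTERING ORDER BY (frequency DESC, content ASC), so that rows are
already stored in the order the query asks for.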

You may want to maintain other tables as well for different types of select
queries.

Your UPDATE statement won't work as written: frequency is a clustering
column in this table, and you can't change the value of a clustering column,
so you'll have to DELETE the old row and INSERT a new one.  If you don't know
what your old frequency is ahead of time (to do the delete), you'll need to
keep another table mapping (content, year) -> frequency.
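
A minimal sketch of that delete-and-insert flow, assuming a lookup table
named word_frequency (the name and layout are hypothetical) and example
values of old frequency 1 and new frequency 2:

```cql
-- Hypothetical lookup table: (content, year) -> current frequency.
CREATE TABLE word_frequency (
  content text,
  year int,
  frequency int,
  PRIMARY KEY ((content, year))
);

-- Step 1: read the current frequency (assume this returns 1).
SELECT frequency FROM word_frequency WHERE content = 'abc' AND year = 1990;

-- Step 2: move the row in words and record the new value.
BEGIN BATCH
  DELETE FROM words WHERE year = 1990 AND frequency = 1 AND content = 'abc';
  INSERT INTO words (year, frequency, content) VALUES (1990, 2, 'abc');
  UPDATE word_frequency SET frequency = 2
    WHERE content = 'abc' AND year = 1990;
APPLY BATCH;
```

The read in step 1 and the batch in step 2 are separate operations, so two
clients updating the same word at the same time can still leave a stale row
behind in words. If the new frequency equals the old one, skip the batch,
since the delete and the insert would then target the same row.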

Now, the tricky part here is that the above model limits the total number
of partitions to the number of years you're working with.  All the words for
a given year live in a single partition, so adding nodes won't spread the
load and the model will not scale as the cluster increases in size.  Ideally
you could bucket frequencies so that each year is split across several
partitions.  If that feels like too much work (it's starting to for me), this
may be better suited to something like Solr, Elasticsearch, or DSE
(Cassandra + Solr).
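
One possible shape for the bucketing idea, as a sketch only: the table name
words_bucketed, the bucket column, and the bucket width of 100 are all
assumptions, not something specified in this thread.

```cql
-- Hypothetical bucketed layout. The client computes
-- bucket = frequency / 100 (integer division; the width is an assumption).
CREATE TABLE words_bucketed (
  year int,
  bucket int,
  frequency int,
  content text,
  PRIMARY KEY ((year, bucket), frequency, content)
) WITH CLUSTERING ORDER BY (frequency DESC, content ASC);

-- Top 10 for a year: start at the highest bucket the client knows about
-- (7 is just an example value) and walk down to lower buckets until
-- 10 rows have been collected.
SELECT content, frequency FROM words_bucketed
  WHERE year = 2010 AND bucket = 7 LIMIT 10;
```

The cost is on the client side: it has to track the highest non-empty bucket
per year, may need several queries for one top-10 result, and still has to
DELETE and INSERT when a word's frequency changes.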

Does that help?


On Wed Nov 12 2014 at 9:01:44 AM Chamila Wijayarathna <> wrote:

> Hello all,
> I have a data set with attributes content and year. I want to put them
> into a CF 'words' with attributes ('content', 'year', 'frequency'). The CF
> should support the following operations:
>    - The frequency attribute of a column can be updated, i.e. I can run a
>    query like "UPDATE words SET frequency = 2 WHERE content='abc' AND
>    year=1990;", where the WHERE clause contains content and year.
>    - It should support a select query like "SELECT content FROM words WHERE
>    year = 2010 ORDER BY frequency DESC LIMIT 10;", where the WHERE clause
>    contains only year and the results are ordered by frequency.
> Can this kind of requirement be fulfilled using Cassandra? What CF
> structure and indexing do I need to use here? What queries should I use to
> create the CF and the indexes?
> Thank You!
> --
> *Chamila Dilshan Wijayarathna,*
> Undergraduate,
> Department of Computer Science and Engineering,
> University of Moratuwa.
