It isn’t natively supported, but there are some things you can do if you need it.
A lot depends on how frequently this list is updated. For heavier workloads I would recommend using a custom CF (column family) for this instead of collections. With extremely high insert rates you would want to add additional partitioning to it as well. As mentioned below, I’d recommend a cleanup MR (MapReduce) job to trim it periodically if the risk of TTLs leaving you with 0 entries is unacceptable. Putting the list in its own CF keeps its elements from polluting your user’s partition. If a lot of tombstones and inserts accumulate, reading the user could become slow (it would look like a queue, which has horrible performance), so a separate CF at least sections that badness off from the regular user lookups.
CREATE TABLE user_top_places (
  user_id varchar,
  created timeuuid,
  place varchar,
  PRIMARY KEY (user_id, created))
WITH CLUSTERING ORDER BY (created DESC);
Then, to add a new entry to the front of the “list”:
INSERT INTO user_top_places (user_id, created, place) VALUES ('frodo', now(), 'mordor');
and to see the last 10 entries:
SELECT * FROM user_top_places WHERE user_id = 'frodo' LIMIT 10;
This will give you the last 10 entries (duplicates are allowed, though). Older records will still be around, however, and disk space could eventually become a problem. If it gets bad, I would recommend a periodic job (Hadoop, for example) to remove the excess columns, solely to save disk space. If you can afford the disk, though, you will get better performance by just letting it grow up to a point (provided rows don’t get too large, i.e. >64 MB). If this isn’t very write-heavy there might be some more clever things you can do...
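The selection step of such a cleanup job could be sketched as follows. This is only an illustration in plain Python, not a Hadoop job: it assumes the rows for one user have already been read as (created, place) pairs, and the function name excess_entries, the keep parameter, and the sample data are made up for this example. It returns the clustering keys that fall outside the newest entries, which the job would then delete.

```python
def excess_entries(rows, keep=10):
    """Return the `created` keys of rows beyond the newest `keep` entries.

    rows: iterable of (created, place) pairs for a single user_id, in any
    order. `created` only needs to be sortable (e.g. a datetime or an int).
    """
    # Newest first, mirroring CLUSTERING ORDER BY (created DESC).
    ordered = sorted(rows, key=lambda row: row[0], reverse=True)
    # Everything after the first `keep` rows is surplus.
    return [created for created, _place in ordered[keep:]]


# Hypothetical sample data: integer timestamps stand in for timeuuids.
rows = [(1, "shire"), (2, "bree"), (3, "rivendell"), (4, "moria"), (5, "mordor")]
to_delete = excess_entries(rows, keep=3)
```

Each returned key would map to one DELETE against (user_id, created). Keep in mind that those deletes create tombstones of their own, so running the job infrequently is probably the better trade-off.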
If avoiding duplicates is more important, you can use “place” as the column name (the clustering column) instead:
CREATE TABLE user_top_places (user_id varchar, place varchar, created timestamp, PRIMARY KEY (user_id, place));
INSERT INTO user_top_places (user_id, place, created) VALUES ('frodo', 'mordor', dateof(now()));
However, the results won’t be ordered by most recent insert, so you might have to do some client-side sorting and filtering on the created field to show only the latest.
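That client-side step might look roughly like this. It is a sketch in plain Python that assumes the rows for one user have already been fetched as (place, created) pairs; the function name latest_places, the limit parameter, and the sample dates are hypothetical.

```python
from datetime import datetime


def latest_places(rows, limit=10):
    """Return up to `limit` place names, most recently created first.

    rows: iterable of (place, created) pairs for a single user_id. Places
    are already unique because `place` is part of the primary key.
    """
    # Sort on the created field, newest first, then keep the first `limit`.
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    return [place for place, _created in ordered[:limit]]


# Hypothetical sample rows; the dates are made up for illustration.
rows = [
    ("bree", datetime(2013, 1, 2)),
    ("mordor", datetime(2013, 1, 5)),
    ("shire", datetime(2013, 1, 1)),
]
recent = latest_places(rows, limit=2)
```

This reads the user's whole partition on every lookup, so it only stays cheap while the number of distinct places per user is small.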
Looking at the collection type support in CQL3, we can append or remove elements using the "+" and "-" operators:
SET top_places = top_places + [ 'mordor' ] WHERE user_id = 'frodo';
SET top_places = top_places - ['riddermark'] WHERE user_id = 'frodo';
Is there a way to keep the list (collection) at a fixed size?
I was thinking about using a TTL to remove older data after a certain time, but the list will grow too big if the TTL is too long, and if the TTL is too short I run the risk of having an empty list (if there is no new activity).
Even if I don't use a collection type and have my own table, I still run into the same issue.
Any recommendations for handling this type of situation?