cassandra-user mailing list archives

From William Ashley <>
Subject Re: Data Modeling Conundrum
Date Mon, 10 May 2010 18:09:37 GMT
Yeah, I intentionally didn't mention the expected data set size, hoping I could find a more
elegant solution that would work both in the small N and large N cases. In any case, I appreciate
the recommendations.

When I get some time I am interested in looking at the source and figuring out whether or
not getting a "Most/least recently updated" ordering for columns would be doable.

On May 8, 2010, at 4:12 PM, Ed Anuff wrote:

> I was thinking it was going to be a lot more than that. You might want to consider just
> storing them all as a single serialized array of timestamps and uuids.  By my math, you could
> fit up to 40 uuid/timestamp pairs for under 1K.  Then you'd just store something like this:
> // Row key is userId
> 12345 : {
>   last_seen : 387587235233, // timestamp of last visit
>   last_uuid: '256fb890-5a4b-11df-a08a-0800200c9a66',
>   history : 0x000....., // serialized array of N timestamp/uuid pairs (24 bytes per pair)
> }
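
The serialized `history` value described above can be sketched as follows. This is only an illustration under stated assumptions: the byte layout (an 8-byte big-endian timestamp followed by the 16 raw UUID bytes, which gives the 24 bytes per pair mentioned in the message) is a guess, and the names `PAIR`, `pack_history`, `unpack_history`, and `record_seen` are hypothetical. No Cassandra client calls are shown; the resulting bytes would simply be written as the column value.

```python
import struct
import uuid

# Assumed layout (not specified in the thread): 8-byte big-endian signed
# timestamp followed by the 16 raw bytes of the UUID, 24 bytes per pair.
PAIR = struct.Struct(">q16s")


def pack_history(pairs):
    """Serialize a list of (timestamp, uuid.UUID) pairs into one bytes value."""
    return b"".join(PAIR.pack(ts, u.bytes) for ts, u in pairs)


def unpack_history(blob):
    """Inverse of pack_history: bytes back to a list of (timestamp, UUID)."""
    return [(ts, uuid.UUID(bytes=raw)) for ts, raw in PAIR.iter_unpack(blob)]


def record_seen(blob, ts, u, limit=40):
    """Put u at the front with its new timestamp and keep at most `limit` pairs."""
    pairs = [(t, x) for t, x in unpack_history(blob) if x != u]
    pairs.insert(0, (ts, u))
    return pack_history(pairs[:limit])
```

With this layout, 40 pairs occupy 40 × 24 = 960 bytes, which matches the "under 1K" estimate. Each update is a read-modify-write of the whole value, so concurrent updates for the same user could overwrite each other.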
> On Sat, May 8, 2010 at 3:54 PM, William Ashley <> wrote:
> That is a good question, because realistically I see N being under 10, and there are
> no current plans to make use of a large historical record. I could have the update process
> pull all columns and issue deletes as necessary such that only M (M >= N) are kept.
> Thanks for the inspiration.
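
The trimming step mentioned above (pull all columns, then delete until only M remain) might look like the sketch below. It assumes the columns have already been read into a dict mapping each guid to its last-seen timestamp; the function name `columns_to_delete` is hypothetical, and issuing the actual deletes is left to whatever client is in use.

```python
def columns_to_delete(last_seen, keep):
    """Return the column names to delete so only the `keep` most recently seen remain.

    last_seen maps column name (guid) -> last-seen timestamp.
    """
    newest_first = sorted(last_seen, key=last_seen.get, reverse=True)
    return newest_first[keep:]
```

For a row holding guids "a", "b", and "c" with timestamps 3, 1, and 2, keeping M = 2 would select "b" for deletion.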
> On May 8, 2010, at 3:42 PM, Ed Anuff wrote:
>> Sorry, missed that.  I'm not sure if there's a cleaner way than using the approaches
>> you've looked at; hopefully someone else has an answer.  How big is N, and do you need to keep
>> more than N around?
>> On Sat, May 8, 2010 at 10:26 AM, William Ashley <> wrote:
>> This would be a solution if I wanted to get the N most recently CREATED guids, but
>> I'm interested in the most recently SEEN guids.
