incubator-cassandra-dev mailing list archives

From Mark Robson <mar...@gmail.com>
Subject Re: Access counts (was: The concurrent access problem and solutions)
Date Mon, 30 Nov 2009 09:58:47 GMT
Personally, I'd have each server record the access counts itself in local
storage for a while, then push them up to Cassandra under a column name that
is unique to that server and push instance. This creates a delay before the
access counts are updated; I assume that is OK.

So we'd see something like

product21_access_count:{'server1_time12345678':42,
'server2_time123124':99,'server3_time123127385'} ...

Now someone who wants to know the exact count can just read the entire row
and add them up.
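
To make the buffer-then-push idea concrete, here is a minimal sketch in
Python. It does not talk to Cassandra at all: a plain dict (row key ->
{column name: value}) stands in for the column family, and the names
LocalAccessCounter, record, push and exact_count are made up for
illustration. It also assumes a server never pushes twice with the same
timestamp, otherwise the column name would not be unique.

```python
import time
from collections import Counter


class LocalAccessCounter:
    """Buffers access counts on one server and pushes them in batches."""

    def __init__(self, server_id, store):
        self.server_id = server_id
        self.store = store        # hypothetical stand-in for Cassandra
        self.pending = Counter()  # local, not yet pushed counts

    def record(self, product_id):
        # Cheap local increment; nothing is written to the shared store yet.
        self.pending[product_id] += 1

    def push(self, now=None):
        # Column name is unique per server and push (assumes one push per
        # server per timestamp).
        now = int(time.time()) if now is None else now
        column = f"{self.server_id}_time{now}"
        for product_id, count in self.pending.items():
            row = self.store.setdefault(f"{product_id}_access_count", {})
            row[column] = count
        self.pending.clear()


def exact_count(store, product_id):
    # Read the entire row and add up every column.
    return sum(store.get(f"{product_id}_access_count", {}).values())


store = {}
s1 = LocalAccessCounter("server1", store)
s2 = LocalAccessCounter("server2", store)
for _ in range(42):
    s1.record("product21")
for _ in range(99):
    s2.record("product21")
s1.push(now=12345678)
s2.push(now=123124)
print(exact_count(store, "product21"))  # 141
```

Because no two servers ever write the same column, there is nothing to
lock or read-modify-write in the shared store.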

Of course, over time this will consume more and more storage, so a
summarisation process (of which you run just one instance, or which follows
a protocol to avoid summarising the same items at once) can come along and
consolidate them into a single count. Then you get:

product21_access_count:{'total':141}

And if any of the individual servers were pushing more data at the same
time, that's fine too, since each push goes into its own new column.
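
A rough sketch of the summarisation step, again in Python with a plain
dict standing in for one Cassandra row. The function name summarise is
hypothetical. It assumes a single summariser runs at a time and ignores
the question of a reader seeing the row between the "write total" and
"delete columns" steps, which a real setup would have to think about.

```python
def summarise(row):
    """Fold the per-push columns of one row into its 'total' column."""
    # Snapshot the column names first: only these get folded and deleted.
    # Columns pushed after this point are left alone for the next run.
    seen = [name for name in row if name != "total"]
    folded = sum(row[name] for name in seen)
    row["total"] = row.get("total", 0) + folded
    for name in seen:
        del row[name]
    return row


row = {"server1_time12345678": 42, "server2_time123124": 99}
summarise(row)
print(row)  # {'total': 141}

# A later push just adds another column next to 'total'.
row["server1_time12349999"] = 5
print(sum(row.values()))  # 146
```

Readers keep doing the same thing as before (sum every column in the
row), so they do not need to know whether summarisation has run.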

Mark
