incubator-cassandra-user mailing list archives

From Yang <teddyyyy...@gmail.com>
Subject rainbird question (why is the 1minute buffer needed?)
Date Fri, 20 May 2011 21:04:53 GMT
(Sorry if Rainbird is not relevant enough to this list; if so, I'd
appreciate it if someone could point me to a more appropriate venue.)


Rainbird buffers up one minute's worth of events before writing to Cassandra.

It seems that this extra layer of buffering is redundant and could
be avoided: Cassandra's memtable already does buffering, and its
internal implementation is just
Map.put(key, CF). I guess Rainbird does something similar:
column_to_count = map.get(key); column_to_count++; map.put(key,
column_to_count)?
The "++" part is probably already handled by the distributed counters in
Cassandra.
So I guess the Rainbird layer exists because it needs to parse an
incoming event into the various attributes it is interested in. For
example, from a URL we bump up the counts of the
FQDN, domain, path, etc.; Rainbird does the transformation from
URL ---> 3 attrs.
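
A rough sketch of the kind of buffering guessed at above, to make the get / ++ / put step and the URL ---> 3 attrs step concrete. This is not Rainbird's code: `url_to_keys`, `MinuteBuffer`, and `sink` are hypothetical names, the "last two labels" rule for the domain is a naive assumption, and `sink` merely stands in for whatever counter write the real system performs.

```python
from collections import Counter
from urllib.parse import urlsplit


def url_to_keys(url):
    """Split a URL into three attributes: FQDN, domain, and path."""
    parts = urlsplit(url)
    fqdn = parts.hostname or ""
    # Naive assumption: the domain is the last two labels of the host name.
    domain = ".".join(fqdn.split(".")[-2:])
    return [("fqdn", fqdn), ("domain", domain), ("path", fqdn + parts.path)]


class MinuteBuffer:
    """Accumulates counts in memory and hands them to `sink` on flush."""

    def __init__(self, sink):
        # `sink` is a callable taking (key, delta); it stands in for a counter write.
        self.sink = sink
        self.counts = Counter()

    def record(self, url):
        for key in url_to_keys(url):
            self.counts[key] += 1  # the get / ++ / put step

    def flush(self):
        # One write per distinct key, however many events arrived in the window.
        for key, delta in self.counts.items():
            self.sink(key, delta)
        self.counts.clear()


writes = []
buf = MinuteBuffer(lambda key, delta: writes.append((key, delta)))
for url in [
    "http://www.example.com/a",
    "http://www.example.com/a",
    "http://blog.example.com/b",
]:
    buf.record(url)
buf.flush()
```

In this sketch three events produce nine increments in memory but only five writes on flush, one per distinct key. A timer that calls `flush()` once per minute is left out.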

But I guess that transformation might as well be done in the Cassandra
JVM itself, if we could provide some hooks so that a module
translates an incoming request into
multiple keys and bumps up their counts. That way we avoid the
intermediate communication from clients to Rainbird and from Rainbird to
Cassandra. Are there some points I'm missing?
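
To illustrate the hook idea, here is a toy sketch under heavy assumptions. `CounterStore`, `register_translator`, and `submit` are hypothetical and are not part of Cassandra's API; the store is just an in-memory dict standing in for server-side counters, and `host_and_path` is a deliberately simple translator.

```python
from collections import defaultdict


class CounterStore:
    """Toy stand-in for a counter store; NOT Cassandra's API."""

    def __init__(self):
        self.counters = defaultdict(int)
        self.translators = {}

    def register_translator(self, event_type, fn):
        # Hypothetical hook: fn maps one raw event to the list of keys to increment.
        self.translators[event_type] = fn

    def submit(self, event_type, raw_event):
        # The client sends the raw event once; expansion happens inside the store.
        for key in self.translators[event_type](raw_event):
            self.counters[key] += 1


def host_and_path(url):
    # Simple translator for illustration: strip the scheme, split host from path.
    rest = url.split("://", 1)[-1]
    host, _, path = rest.partition("/")
    return ["host:" + host, "path:" + host + "/" + path]


store = CounterStore()
store.register_translator("url", host_and_path)
store.submit("url", "http://www.example.com/a")
store.submit("url", "http://www.example.com/b")
```

Here each event is sent once and every derived key is incremented on the server side, with no aggregation across events before the counters are touched.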

Thanks
Yang
