incubator-cassandra-user mailing list archives

From Tatu Saloranta <tsalora...@gmail.com>
Subject Re: Testing row cache feature in trunk: write should put record in cache
Date Sat, 20 Feb 2010 20:12:20 GMT
On Fri, Feb 19, 2010 at 11:44 AM, Weijun Li <weijunli@gmail.com> wrote:
> I see. How much is the overhead of java serialization? Does it slow down the
> system a lot? It seems to be a tradeoff between CPU usage and memory.

This should be relatively easy to measure as a stand-alone test, or
maybe even from profiler stack traces.

If native Java serialization is used, there may be more efficient
alternatives, depending on the data. Default serialization is highly
inefficient for small object graphs (like individual objects) but OK
for larger graphs: because class metadata is written into the stream,
the result is very self-contained, and that fixed cost dominates when
the graph is small.
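
A rough stand-alone check of this could look like the sketch below. It
is only an illustration: CachedRow is a made-up class (not anything
from Cassandra), and the timing loop is crude (no proper warm-up), so
treat the numbers as ballpark only.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;

// Hypothetical cache entry, for illustration only; not a Cassandra class.
final class CachedRow implements Serializable {
    private static final long serialVersionUID = 1L;
    final long key;   // 8 bytes of payload
    final int value;  // 4 bytes of payload

    CachedRow(long key, int value) {
        this.key = key;
        this.value = value;
    }
}

final class SerializationOverhead {
    // Serializes o with default Java serialization and returns the byte count.
    static int serializedSize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Builds a larger object graph: a list of n rows.
    static ArrayList<CachedRow> rows(int n) {
        ArrayList<CachedRow> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            list.add(new CachedRow(i, i));
        }
        return list;
    }

    public static void main(String[] args) {
        int single = serializedSize(new CachedRow(1L, 42));
        int many = serializedSize(rows(1000));
        System.out.println("one object:   " + single + " bytes for 12 bytes of payload");
        System.out.println("1000 objects: " + (many / 1000.0) + " bytes per object");

        // Crude timing of repeated single-object serialization.
        int iterations = 100_000;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            serializedSize(new CachedRow(i, i));
        }
        long perCall = (System.nanoTime() - start) / iterations;
        System.out.println("approx. " + perCall + " ns per single-object serialization");
    }
}
```

The single object pays for the stream header and the full class
descriptor; in the list, the descriptor is written once and later
objects only refer back to it.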
Beyond default serialization, there are more efficient general-purpose
Java serialization frameworks, such as Kryo, or fast JSON-based
serializers such as Jackson; see
[http://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking]
for some idea of the alternatives.

In fact, one interesting idea would be to trade some more CPU for
less memory by using fast compression (like LZF). I hope to experiment
with this idea some time in the future. The challenge is that this
would help most with a clustered scheme (compressing more than one
distinct item together), which is much trickier to make work.
Compression does OK with individual items, but the real boost comes
from redundancy between similar items.
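
To illustrate the individual vs. clustered difference, here is a
minimal sketch. Two loud assumptions: it uses java.util.zip.Deflater
from the JDK as a stand-in (LZF is not in the standard library and
would give different numbers), and the row format is invented for the
example.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

final class ClusteredCompression {
    // Compresses input with Deflater (stand-in for LZF) and returns the output size.
    static int compressedSize(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    // Hypothetical row content: similar items that differ only in one field.
    static byte[] row(int i) {
        String s = "{\"user\":\"user" + i + "\",\"status\":\"active\",\"region\":\"us-east\"}";
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Total uncompressed size of n rows.
    static int rawTotal(int n) {
        int total = 0;
        for (int i = 0; i < n; i++) {
            total += row(i).length;
        }
        return total;
    }

    // Each row compressed on its own; sizes summed.
    static int individualTotal(int n) {
        int total = 0;
        for (int i = 0; i < n; i++) {
            total += compressedSize(row(i));
        }
        return total;
    }

    // All rows concatenated and compressed as one block.
    static int clusteredTotal(int n) {
        byte[] all = new byte[rawTotal(n)];
        int offset = 0;
        for (int i = 0; i < n; i++) {
            byte[] r = row(i);
            System.arraycopy(r, 0, all, offset, r.length);
            offset += r.length;
        }
        return compressedSize(all);
    }

    public static void main(String[] args) {
        int n = 100;
        System.out.println("raw:        " + rawTotal(n) + " bytes");
        System.out.println("individual: " + individualTotal(n) + " bytes");
        System.out.println("clustered:  " + clusteredTotal(n) + " bytes");
    }
}
```

The tricky part mentioned above is not shown here: with a clustered
block, reading or updating one item means decompressing (and possibly
recompressing) the whole block.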

-+ Tatu +-
