cassandra-user mailing list archives

From Jonathan Ellis <>
Subject Re: Testing row cache feature in trunk: write should put record in cache
Date Sat, 20 Feb 2010 20:20:09 GMT
We don't use native Java serialization for anything but the on-disk
BitSets in our bloom filters (those are deserialized once at
startup, so the overhead doesn't matter), btw.

We're talking about adding compression after

On Sat, Feb 20, 2010 at 3:12 PM, Tatu Saloranta <> wrote:
> On Fri, Feb 19, 2010 at 11:44 AM, Weijun Li <> wrote:
>> I see. How much is the overhead of Java serialization? Does it slow down the
>> system a lot? It seems to be a tradeoff between CPU usage and memory.
> This should be relatively easy to measure as a stand-alone thing, or
> maybe even from profiler stack traces.
> If native Java serialization is used, there may be more efficient
> alternatives, depending on the data -- default serialization is highly
> inefficient for small object graphs (like individual objects), but
> acceptable for larger graphs. This is because much of the class
> metadata is included in the stream, so the result is very self-contained.
> Beyond default serialization, there are more efficient general-purpose
> Java serialization frameworks, like Kryo, or fast(est) JSON-based
> serializers (Jackson); see
> []
> for some ideas on alternatives.
> In fact, one interesting idea would be to further trade some CPU for
> less memory by using fast compression (like LZF). I hope to experiment
> with this idea some time in the future. But the challenge is that this
> would help most with a clustered scheme (compressing more than one
> distinct item), which is much trickier to make work. Compression does
> OK with individual items, but the real boost comes from redundancy
> between similar items.
> -+ Tatu +-
