incubator-cassandra-dev mailing list archives

From Zhu Han <schumi....@gmail.com>
Subject Re: improving read performance
Date Wed, 22 Sep 2010 03:46:33 GMT
> Reasons to not use the row cache with large rows include:
>
> * In general it's a waste of memory better given to the OS page cache,
> unless possibly you're continually reading entire rows rather than
> subsets of rows.
>
> * For truly large rows you may have immediate issues with the size of
> the data being cached; e.g. attempting to cache a 2 GB row is not the
> best idea in terms of heap space consumption; you'll likely OOM or
> trigger fallbacks to full GC, etc.
>
> * Having a larger key cache may often be more productive.
>
> > That aside, splitting the memtable in 2, could make checking the bloom
> > filters unnecessary in most cases for me, but I'm not sure it's worth the
> > effort.
>
> Write-through row caching seems like a more direct approach to me
> personally, off hand. Also to the extent that you're worried about
> false positive rates, larger bloom filters may still be an option (not
> currently configurable; would require source changes).
>
IMHO, it's very difficult to tune the JVM when it caches a lot of data for a
long time, because modern garbage collectors are not designed for that
purpose.

There is a patch that makes the row cache pluggable, so that it can be
replaced by memcached [1]. This is likely the right way to go.

[1]https://issues.apache.org/jira/browse/CASSANDRA-1283
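
To illustrate what "pluggable" could mean here, below is a rough sketch in
Python (chosen only for brevity). It is not the code from the patch and not
Cassandra's API: the names RowCache, InProcessRowCache and RowStore are
hypothetical. The point is that the read path only talks to a small
interface, so a memcached-backed class implementing the same three methods
could keep the cached rows outside the JVM heap.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class RowCache(ABC):
    """Hypothetical cache interface; an assumption, not Cassandra's API."""

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]:
        """Return the cached row, or None on a miss."""

    @abstractmethod
    def put(self, key: str, row: bytes) -> None:
        """Store a row under the given key."""

    @abstractmethod
    def invalidate(self, key: str) -> None:
        """Drop any cached row for the given key."""


class InProcessRowCache(RowCache):
    """Keeps rows in the process's own memory (the case that stresses the GC)."""

    def __init__(self) -> None:
        self._rows: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._rows.get(key)

    def put(self, key: str, row: bytes) -> None:
        self._rows[key] = row

    def invalidate(self, key: str) -> None:
        self._rows.pop(key, None)


class RowStore:
    """Toy read/write path that depends only on the RowCache interface."""

    def __init__(self, cache: RowCache, disk: Dict[str, bytes]) -> None:
        self.cache = cache
        self.disk = disk          # stands in for the on-disk data
        self.disk_reads = 0       # counts how often the cache was bypassed

    def read(self, key: str) -> Optional[bytes]:
        row = self.cache.get(key)
        if row is None:
            # Cache miss: fall back to "disk" and populate the cache.
            self.disk_reads += 1
            row = self.disk.get(key)
            if row is not None:
                self.cache.put(key, row)
        return row

    def write(self, key: str, row: bytes) -> None:
        # Invalidate on write so a later read cannot see a stale row.
        self.disk[key] = row
        self.cache.invalidate(key)
```

Swapping the cache would then be a one-line change where RowStore is
constructed; the read and write paths stay the same.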


> --
> / Peter Schuller
>
