cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Testing row cache feature in trunk: write should put record in cache
Date Fri, 19 Feb 2010 21:03:14 GMT
mmap is designed to handle that case, yes. It is already in the 0.6 branch.
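
To make the mechanics concrete: mapping a file reserves address space, not
RAM. The OS pages data in on access and evicts cold pages under pressure, so
a 100GB file on an 8GB box works fine (cold reads just cost page faults).
A minimal sketch of the underlying Java API, not the actual Cassandra code
(the file path is made up; note a single MappedByteBuffer is limited to
Integer.MAX_VALUE bytes, which is why larger files have to be mapped as
multiple segments):

    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    public class MmapSketch
    {
        public static void main(String[] args) throws Exception
        {
            // hypothetical sstable path, for illustration only
            RandomAccessFile raf = new RandomAccessFile(
                "/var/lib/cassandra/data/Keyspace1/Standard1-1-Data.db", "r");
            FileChannel channel = raf.getChannel();

            // Map up to 2GB of the file. This allocates address space,
            // not memory; a 100GB file is just many such segments.
            long length = Math.min(channel.size(), Integer.MAX_VALUE);
            MappedByteBuffer buffer =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, length);

            // Touching a byte faults its page in; pages you never touch
            // never occupy RAM, and the kernel evicts cold ones as needed.
            byte first = buffer.get(0);
            System.out.println("first byte: " + first);

            channel.close();
            raf.close();
        }
    }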

On Fri, Feb 19, 2010 at 2:44 PM, Weijun Li <weijunli@gmail.com> wrote:
> I see. How much overhead does Java serialization add? Does it slow the
> system down a lot? It seems to be a tradeoff between CPU usage and memory.
>
> As for mmap in 0.6: do you mmap the sstable data file even if it is a lot
> larger than the available memory (e.g., the data file is over 100GB while
> you have only 8GB of RAM)? How efficient is mmap in this case? Is mmap
> already checked into the 0.6 branch?
>
> -Weijun
>
> On Fri, Feb 19, 2010 at 4:56 AM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>
>> The whole point of the row cache is to avoid the serialization overhead,
>> though.  If we just wanted the serialized form cached, we would let
>> the OS block cache handle that without adding an extra layer.  (0.6
>> uses mmap'd I/O by default on 64-bit JVMs, so this is very efficient.)
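>>
>> If you want to put a rough number on that overhead yourself, here is a
>> quick-and-dirty way to time java serialization round trips (unscientific,
>> and the cost varies a lot with the object graph; a real row is a tree of
>> column objects, not a byte[], and Cassandra uses its own serializers
>> rather than java.io serialization):
>>
>>     import java.io.*;
>>
>>     public class SerializationCost
>>     {
>>         public static void main(String[] args) throws Exception
>>         {
>>             byte[] payload = new byte[1024]; // ~1KB, like the rows discussed
>>             int rounds = 100000;
>>             long start = System.nanoTime();
>>             for (int i = 0; i < rounds; i++)
>>             {
>>                 // serialize
>>                 ByteArrayOutputStream bos = new ByteArrayOutputStream();
>>                 ObjectOutputStream oos = new ObjectOutputStream(bos);
>>                 oos.writeObject(payload);
>>                 oos.close();
>>                 // deserialize -- the per-read cost the row cache avoids
>>                 ObjectInputStream ois = new ObjectInputStream(
>>                     new ByteArrayInputStream(bos.toByteArray()));
>>                 ois.readObject();
>>             }
>>             System.out.printf("%.2f us per round trip%n",
>>                               (System.nanoTime() - start) / 1000.0 / rounds);
>>         }
>>     }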
>>
>> On Fri, Feb 19, 2010 at 3:29 AM, Weijun Li <weijunli@gmail.com> wrote:
>> > The memory overhead issue is not directly related to GC, because by the
>> > time the JVM ran out of memory the GC had already been very busy for
>> > quite a while. In my case the JVM consumed all of its 6GB when the row
>> > cache size hit 1.4 million rows.
>> >
>> > I haven't started testing the row cache feature yet. But I think data
>> > compression would be useful for reducing memory consumption, because in
>> > my experience disk I/O is always the bottleneck for Cassandra while its
>> > CPU usage is usually low. In addition, compression should also help to
>> > reduce the number of Java objects dramatically (correct me if I'm
>> > wrong), especially if we need to cache most of the data to achieve
>> > decent read latency.
>> >
>> > If ColumnFamily is serializable, it shouldn't be that hard to implement
>> > the compression feature, controlled by an option (again :-) in
>> > storage-conf.xml.
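>> >
>> > The shape of what I have in mind is roughly this (just a sketch with
>> > made-up names; the real thing would hook into Cassandra's own
>> > ColumnFamily serialization):
>> >
>> >     import java.io.ByteArrayOutputStream;
>> >     import java.util.zip.Deflater;
>> >     import java.util.zip.Inflater;
>> >
>> >     public class RowCompressor
>> >     {
>> >         // Deflate the serialized row before it goes into the cache:
>> >         // one byte[] per row instead of a tree of column objects.
>> >         public static byte[] compress(byte[] serializedRow)
>> >         {
>> >             Deflater deflater = new Deflater(Deflater.BEST_SPEED);
>> >             deflater.setInput(serializedRow);
>> >             deflater.finish();
>> >             ByteArrayOutputStream out =
>> >                 new ByteArrayOutputStream(serializedRow.length);
>> >             byte[] chunk = new byte[4096];
>> >             while (!deflater.finished())
>> >                 out.write(chunk, 0, deflater.deflate(chunk));
>> >             deflater.end();
>> >             return out.toByteArray();
>> >         }
>> >
>> >         // Inflate on a cache hit: the CPU we trade for memory.
>> >         public static byte[] decompress(byte[] compressed, int originalLength)
>> >             throws Exception
>> >         {
>> >             Inflater inflater = new Inflater();
>> >             inflater.setInput(compressed);
>> >             byte[] out = new byte[originalLength];
>> >             inflater.inflate(out);
>> >             inflater.end();
>> >             return out;
>> >         }
>> >     }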
>> >
>> > When I get to that point you can instruct me to implement this feature
>> > along with the row-cache write-through. Our goal is straightforward: to
>> > support low read latency in a high-volume web application with a
>> > write/read ratio of 1:1.
>> >
>> > -Weijun
>> >
>> > -----Original Message-----
>> > From: Jonathan Ellis [mailto:jbellis@gmail.com]
>> > Sent: Thursday, February 18, 2010 12:04 PM
>> > To: cassandra-user@incubator.apache.org
>> > Subject: Re: Testing row cache feature in trunk: write should put
>> > record in cache
>> >
>> > Did you force a GC from jconsole to make sure you weren't just
>> > measuring uncollected garbage?
>> >
>> > On Wed, Feb 17, 2010 at 2:51 PM, Weijun Li <weijunli@gmail.com> wrote:
>> >> OK, I'll work on the change later, because there's another problem to
>> >> solve: the cache overhead is so big that 1.4 million records (1KB each)
>> >> consumed all 6GB of the JVM's memory (I guess 4GB went to the row
>> >> cache). I'm thinking that ConcurrentHashMap is not a good choice for an
>> >> LRU cache, and that the row cache needs to store compressed key data to
>> >> reduce memory usage. I'll do more investigation on this and let you
>> >> know.
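>> >>
>> >> For the LRU part I'm picturing something like LinkedHashMap's access
>> >> order instead of ConcurrentHashMap (just a sketch; it would need
>> >> external synchronization since LinkedHashMap is not thread-safe):
>> >>
>> >>     import java.util.LinkedHashMap;
>> >>     import java.util.Map;
>> >>
>> >>     public class LruCache<K, V> extends LinkedHashMap<K, V>
>> >>     {
>> >>         private final int capacity;
>> >>
>> >>         public LruCache(int capacity)
>> >>         {
>> >>             // accessOrder=true: get() moves an entry to the back, so
>> >>             // the front of the iteration order is the LRU entry
>> >>             super(16, 0.75f, true);
>> >>             this.capacity = capacity;
>> >>         }
>> >>
>> >>         @Override
>> >>         protected boolean removeEldestEntry(Map.Entry<K, V> eldest)
>> >>         {
>> >>             return size() > capacity; // evict LRU once past the cap
>> >>         }
>> >>     }
>> >>
>> >> Wrapped in Collections.synchronizedMap() it would at least bound the
>> >> entry count; whether the lock contention is acceptable is another
>> >> question.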
>> >>
>> >> -Weijun
>> >>
>> >> On Tue, Feb 16, 2010 at 9:22 PM, Jonathan Ellis <jbellis@gmail.com>
>> >> wrote:
>> >>>
>> >>> ... tell you what, if you write the option-processing part in
>> >>> DatabaseDescriptor I will do the actual cache part. :)
>> >>>
>> >>> On Tue, Feb 16, 2010 at 11:07 PM, Jonathan Ellis <jbellis@gmail.com>
>> >>> wrote:
>> >>> > https://issues.apache.org/jira/secure/CreateIssue!default.jspa, but
>> >>> > this is pretty low priority for me.
>> >>> >
>> >>> > On Tue, Feb 16, 2010 at 8:37 PM, Weijun Li <weijunli@gmail.com>
>> >>> > wrote:
>> >>> >> Just tried to make a quick change to enable it, but it didn't
>> >>> >> work out :-(
>> >>> >>
>> >>> >>                 ColumnFamily cachedRow = cfs.getRawCachedRow(mutation.key());
>> >>> >>
>> >>> >>                 // What I modified: populate the cache on a miss
>> >>> >>                 // so the write lands in it
>> >>> >>                 if (cachedRow == null) {
>> >>> >>                     cfs.cacheRow(mutation.key());
>> >>> >>                     cachedRow = cfs.getRawCachedRow(mutation.key());
>> >>> >>                 }
>> >>> >>
>> >>> >>                 if (cachedRow != null)
>> >>> >>                     cachedRow.addAll(columnFamily);
>> >>> >>
>> >>> >> How can I open a ticket for you to make the change (enable row
>> >>> >> cache write-through with an option)?
>> >>> >>
>> >>> >> Thanks,
>> >>> >> -Weijun
>> >>> >>
>> >>> >> On Tue, Feb 16, 2010 at 5:20 PM, Jonathan Ellis <jbellis@gmail.com>
>> >>> >> wrote:
>> >>> >>>
>> >>> >>> On Tue, Feb 16, 2010 at 7:17 PM, Jonathan Ellis
>> >>> >>> <jbellis@gmail.com>
>> >>> >>> wrote:
>> >>> >>> > On Tue, Feb 16, 2010 at 7:11 PM, Weijun Li <weijunli@gmail.com>
>> >>> >>> > wrote:
>> >>> >>> >> Just started to play with the row cache feature in trunk: it
>> >>> >>> >> seems to be working fine so far, except that for the RowsCached
>> >>> >>> >> parameter you need to specify a number of rows rather than a
>> >>> >>> >> percentage (e.g., "20%" doesn't work).
>> >>> >>> >
>> >>> >>> > 20% works, but it's 20% of the rows at server startup.  So on a
>> >>> >>> > fresh start that is zero.
>> >>> >>> >
>> >>> >>> > Maybe we should just get rid of the % feature...
>> >>> >>>
>> >>> >>> (Actually, it shouldn't be hard to update this on flush, if you
>> >>> >>> want to open a ticket.)
>> >>> >>
>> >>> >>
>> >>> >
>> >>
>> >>
>> >
>> >
>
>
