From: Daniel Doubleday
To: user@cassandra.apache.org
Subject: Re: Row cache
Date: Thu, 30 Jun 2011 18:03:15 +0200

Here's my understanding of things ...
(this applies only to the regular heap implementation of the row cache)

> Why Cassandra does not cache a row that was requested few times?

What does the cache capacity read? Is it > 0?

> What the ReadCount attribute in ColumnFamilies indicates and why it remains zero.

Hm, I had that once too (the read count wouldn't go up even though there were reads), but I didn't have the time to debug it.

> How can I know from where Cassandra read a row (from MEMTable,RowCache or SSTable)?

It will always read from either the row cache, or the memtable(s) and sstable(s).

JMX should tell you (the cache hits go up).

> does the following correct? In read operation Cassandra looks for the row in the MEMTable - if not found it looks in the row-cache - if not found it looks in SSTable (after looking in the key-cache to optimize the access to the SSTable)?

No. 

If the row cache capacity is > 0, a read first checks whether the row is in the cache. If it is not, the read loads the entire row and caches it. In either case it then reads from the cached row and applies the respective filter to the cached CF.

Writes update the memtable, and also the row cache if the row is cached. I must admit that I still don't quite understand why there's no race here. I haven't found any cache lock, so someone else should explain why a concurrent read / write cannot produce a lost update in the cached row.
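
To make the cached read and write path above a bit more concrete, here is a minimal single-threaded sketch in Python. It is a toy model and not Cassandra's code: the class ToyColumnFamily, its attributes and the helper _read_full_row are made up for illustration, the "filter" is reduced to a list of column names, and evicting the oldest cached row when the cache is full is my own simplification.

```python
class ToyColumnFamily:
    """Toy model of the row-cache read/write path (not Cassandra code)."""

    def __init__(self, row_cache_capacity):
        self.row_cache_capacity = row_cache_capacity
        self.row_cache = {}  # row key -> full row (column name -> value)
        self.storage = {}    # stands in for memtable(s) + sstable(s)
        self.cache_hits = 0

    def _read_full_row(self, key):
        # Hypothetical helper: a real node would merge memtables and sstables.
        return dict(self.storage.get(key, {}))

    def read(self, key, columns):
        if self.row_cache_capacity <= 0:
            # Cache disabled: always go to the underlying storage.
            row = self._read_full_row(key)
        elif key in self.row_cache:
            self.cache_hits += 1
            row = self.row_cache[key]
        else:
            # Miss: read the entire row and cache it.
            row = self._read_full_row(key)
            if len(self.row_cache) >= self.row_cache_capacity:
                # Simplification: evict the oldest inserted row.
                self.row_cache.pop(next(iter(self.row_cache)))
            self.row_cache[key] = row
        # Apply the filter (here: a column list) to the full row.
        return {c: row[c] for c in columns if c in row}

    def write(self, key, column, value):
        # Writes always go to storage (the memtable in the real system) ...
        self.storage.setdefault(key, {})[column] = value
        # ... and also to the cached row, but only if the row is cached.
        if key in self.row_cache:
            self.row_cache[key][column] = value


cf = ToyColumnFamily(row_cache_capacity=10)
cf.write("user1", "name", "alice")
cf.write("user1", "city", "berlin")
first = cf.read("user1", ["name"])   # miss: whole row is loaded and cached
cf.write("user1", "name", "bob")     # row is cached, so the cache is updated too
second = cf.read("user1", ["name"])  # hit: served from the cached row
```

Note that nothing in this sketch is locked, so it says nothing about the concurrent read / write question; it only shows the order of the steps.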

If the capacity is 0, the read goes to the current memtable, the memtable(s) that are being flushed, and all sstables that may contain the row (filtered by bloom filter).
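
The uncached read path can be sketched in the same spirit. Again this is only an illustration under my own assumptions: ToyBloomFilter, ToySSTable and read_row are hypothetical names, memtables are plain dicts, and the merge rule (per column, the value with the newest timestamp wins) is not spelled out in the text above.

```python
import hashlib


class ToyBloomFilter:
    """Tiny Bloom filter: may give false positives, never false negatives."""

    def __init__(self, size_bits=256, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class ToySSTable:
    def __init__(self, rows):
        self.rows = rows  # row key -> {column: (timestamp, value)}
        self.bloom = ToyBloomFilter()
        for key in rows:
            self.bloom.add(key)


def read_row(key, current_memtable, flushing_memtables, sstables):
    """Merge all sources for key; per column the newest timestamp wins."""
    sources = [current_memtable, *flushing_memtables]
    # Only consult sstables whose bloom filter says the row may be there.
    sources += [s.rows for s in sstables if s.bloom.might_contain(key)]
    merged = {}
    for source in sources:
        for column, (ts, value) in source.get(key, {}).items():
            if column not in merged or ts > merged[column][0]:
                merged[column] = (ts, value)
    return {column: value for column, (ts, value) in merged.items()}


old = ToySSTable({"user1": {"name": (1, "alice"), "city": (1, "berlin")}})
other = ToySSTable({"user2": {"name": (1, "carol")}})
flushing = [{"user1": {"city": (2, "hamburg")}}]
current = {"user1": {"name": (3, "bob")}}
row = read_row("user1", current, flushing, [old, other])
```

A false positive from a bloom filter only costs an extra lookup here: the sstable is consulted, turns out not to hold the row, and contributes nothing to the merge.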

Hope that's correct and helps.

Cheers,
Daniel
