Subject: Re: Key cache hit rate issue
From: Franc Carter
To: user@cassandra.apache.org
Date: Fri, 17 Feb 2012 08:57:36 +1100


On 17/02/2012 8:53 AM, "Eran Chinthaka Withana" <eran.chinthaka@gmail.com> wrote:
>
> Hi Jonathan,
>
> Thanks for the reply. Yes, there is a possibility that the keys can be distributed in multiple SSTables, but my data access patterns are such that I always read/write the whole row. So I expect all the data to be in the same SSTable (please correct me if I'm wrong).
>
> For some reason 16637958 (the keys cached) has become a golden number and I don't see key cache increasing beyond that. I also checked memory and I have about 4GB left in JVM memory and didn't see any issues on logs.

I have seen the same thing with the keycache size becoming static

cheers

>
> Thanks,
> Eran Chinthaka Withana
>
>
>
> On Thu, Feb 16, 2012 at 1:20 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>>
>> So, you have roughly 1/6 of your (physical) row keys cached and about
>> 1/4 cache hit rate, which doesn't sound unreasonable to me. Remember,
>> each logical key may be spread across multiple physical sstables --
>> each (key, sstable) pair is one entry in the key cache.
>>
>> On Thu, Feb 16, 2012 at 1:48 PM, Eran Chinthaka Withana
>> <eran.chinthaka@gmail.com> wrote:
>> > Hi Aaron,
>> >
>> > Here it is.
>> >
>> > Keyspace: XXXX
>> > Read Count: 1123637972
>> > Read Latency: 5.757938114343114 ms.
>> > Write Count: 128201833
>> > Write Latency: 0.0682576607387509 ms.
>> > Pending Tasks: 0
>> > Column Family: YY
>> > SSTable count: 18
>> > Space used (live): 103318720685
>> > Space used (total): 103318720685
>> > Number of Keys (estimate): 92404992
>> > Memtable Columns Count: 1425580
>> > Memtable Data Size: 359655747
>> > Memtable Switch Count: 2522
>> > Read Count: 1123637972
>> > Read Latency: 14.731 ms.
>> > Write Count: 128201833
>> > Write Latency: NaN ms.
>> > Pending Tasks: 0
>> > Bloom Filter False Postives: 1488
>> > Bloom Filter False Ratio: 0.00000
>> > Bloom Filter Space Used: 331522920
>> > Key cache capacity: 16637958
>> > Key cache size: 16637958
>> > Key cache hit rate: 0.2708333333333333
>> > Row cache: disabled
>> > Compacted row minimum size: 51
>> > Compacted row maximum size: 6866
>> > Compacted row mean size: 2560
>> >
>> > Thanks,
>> > Eran Chinthaka Withana
>> >
>> >
>> >
>> > On Thu, Feb 16, 2012 at 12:30 AM, aaron morton <aaron@thelastpickle.com>
>> > wrote:
>> >>
>> >> It's in the order of 261 to 8000 and the ratio is 0.00. But I guess 8000 is
>> >> a bit high. Is there a way to fix/improve it?
>> >>
>> >> Sorry I don't understand what you mean. But if the ratio is 0.0 all is
>> >> good.
>> >>
>> >> Could you include the full output from cfstats for the CF you are looking
>> >> at?
>> >>
>> >> Cheers
>> >>
>> >> -----------------
>> >> Aaron Morton
>> >> Freelance Developer
>> >> @aaronmorton
>> >> http://www.thelastpickle.com
>> >>
>> >> On 15/02/2012, at 1:00 PM, Eran Chinthaka Withana wrote:
>> >>
>> >> It's in the order of 261 to 8000 and the ratio is 0.00. But I guess 8000 is
>> >> a bit high. Is there a way to fix/improve it?
>> >>
>> >> Thanks,
>> >> Eran Chinthaka Withana
>> >>
>> >>
>> >> On Tue, Feb 14, 2012 at 3:42 PM, aaron morton <aaron@thelastpickle.com>
>> >> wrote:
>> >>>
>> >>> Out of interest what does cfstats say about the bloom filter stats? A
>> >>> high false positive rate could lead to a low key cache hit rate.
>> >>>
>> >>> Also, is there a way to warm start the key cache, meaning pre-load the
>> >>> amount of keys I set as keys_cached?
>> >>>
>> >>> See key_cache_save_period when creating the CF.
>> >>>
>> >>> Cheers
>> >>>
>> >>>
>> >>> -----------------
>> >>> Aaron Morton
>> >>> Freelance Developer
>> >>> @aaronmorton
>> >>> http://www.thelastpickle.com
>> >>>
>> >>> On 15/02/2012, at 5:54 AM, Eran Chinthaka Withana wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> I'm using Cassandra 1.0.7 and I've set the keys_cached to about 80%
>> >>> (using the numerical values). This is visible in cfstats too. But I'm
>> >>> getting less than 20% (or sometimes even 0%) key cache hit rate. Well, the
>> >>> data access pattern is not the issue here as I know they are retrieving the
>> >>> same row multiple times. I'm using Hector client with dynamic load balancing
>> >>> policy with consistency ONE for both reads and writes. Any ideas on how to
>> >>> find the issue and fix this?
>> >>>
>> >>> Here is what I see on cfstats.
>> >>>
>> >>> Key cache capacity: 16637958
>> >>> Key cache size: 16637958
>> >>> Key cache hit rate: 0.045454545454545456
>> >>>
>> >>> Also, is there a way to warm start the key cache, meaning pre-load the
>> >>> amount of keys I set as keys_cached?
>> >>>
>> >>> Thanks,
>> >>> Eran
>> >>>
>> >>>
>> >>
>> >>
>> >
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>
>
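
To make the "roughly 1/6 of your (physical) row keys cached" remark concrete, here is a minimal Python sketch that redoes the arithmetic from the cfstats figures quoted in this thread. It is only an illustration: the function names and the `sstables_per_key` parameter are hypothetical and not part of Cassandra, and the sketch assumes that every logical key is spread across the same number of SSTables, which a real column family will not satisfy exactly.

```python
# Figures copied from the cfstats output quoted in this thread.
key_cache_capacity = 16_637_958
key_cache_size = 16_637_958
estimated_keys = 92_404_992  # "Number of Keys (estimate)"


def cached_fraction(cache_entries: int, physical_keys: int) -> float:
    """Fraction of physical (per-SSTable) row keys that have a cache entry."""
    return cache_entries / physical_keys


def logical_keys_covered(cache_entries: int, sstables_per_key: int) -> int:
    """Upper bound on fully cached logical keys.

    Each (key, sstable) pair takes one cache entry, so a key that lives in
    k SSTables needs k entries. sstables_per_key is a hypothetical uniform
    spread used only for illustration.
    """
    return cache_entries // sstables_per_key


fraction = cached_fraction(key_cache_size, estimated_keys)
print(f"cache full: {key_cache_size == key_cache_capacity}")
print(f"fraction of physical keys cached: {fraction:.3f}")

for k in (1, 2, 4):
    covered = logical_keys_covered(key_cache_size, k)
    print(f"keys spread over {k} SSTable(s): up to {covered} logical keys fit")
```

Key cache size equals key cache capacity in the quoted output, so the cache is simply full, which would explain why the number stops growing. The more SSTables a row is spread across, the fewer distinct rows the same number of entries can cover.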
