From: Andras Szerdahelyi <andras.szerdahelyi@ignitionone.com>
To: "user@cassandra.apache.org"
Subject: Re: index_interval memory savings in our case (if you are curious) (and performance result)...
Date: Thu, 21 Mar 2013 07:51:17 +0000

Wow. So LCS with a bloom filter fp chance of 0.1 and an index sampling rate
of 512, on a column family of 1.7 billion rows per node, yields 100% of
reads served from a single sstable? That sounds amazing. And I assume this
is cfhistograms output from a node that has been on 512 for a while?
( I still think it's unlikely 1.2.x re-samples sstables on startup -- I'm
on 1.1.x though. )

For LCS with the same fp chance and sampling rate, 300-500 million rows per
node ( 300-400GB ) on 1.1.x, my sstable reads for a single read got pretty
much out of control.
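( For anyone comparing notes from the archive: the knobs being discussed can
be checked roughly like this. This is only a sketch -- "MyKS" / "MyCF" are
placeholder names, the yaml path is an assumption for a packaged install, and
the syntax is the 1.1.x/1.2.x style where index_interval still lives in
cassandra.yaml: )

    # sstables touched per read: the "SSTables" column of cfhistograms
    nodetool -h localhost cfhistograms MyKS MyCF

    # index sampling rate (cassandra.yaml on 1.1.x/1.2.x; default is 128,
    # larger values mean a smaller index sample held in heap)
    grep index_interval /etc/cassandra/cassandra.yaml
    # index_interval: 512

    # per-CF bloom filter fp chance, set from cassandra-cli:
    # update column family MyCF with bloom_filter_fp_chance = 0.1;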
On 20/03/13 14:35, "Hiller, Dean" wrote:

>I am using LCS, so the bloom filter fp default for 1.2.2 is 0.1 and my
>bloom filter size is 1.27G RAM (nodetool cfstats)... 1.7 billion rows each
>node.
>
>My cfstats for this CF is attached (since cut and paste screwed up the
>formatting). During testing in QA, we were not sure if the index_interval
>change was working, so we dug into the code to find out: it basically seems
>to convert immediately on startup, though it doesn't log anything except at
>"debug" level, which we don't have on.
>
>Dean
>
>
>
>On 3/20/13 6:58 AM, "Andras Szerdahelyi" wrote:
>
>>I am curious, thanks. ( I am in the same situation, big nodes choking
>>under 300-400G data load, 500mil keys )
>>
>>What does your "cfhistograms Keyspace CF" output look like? How many
>>sstable reads?
>>What is your bloom filter fp chance?
>>
>>Regards,
>>Andras
>>
>>On 20/03/13 13:54, "Hiller, Dean" wrote:
>>
>>>Oh, and to give you an idea of memory savings, we had a node at 10G RAM
>>>usage... we had upped a few nodes to 16G from 8G as we don't have our new
>>>nodes ready yet (we know we should be at 8G, but we would have a dead
>>>cluster if we did that).
>>>
>>>On startup, the initial RAM is around 6-8G. Startup with
>>>index_interval=512 resulted in 2.5G-2.8G initial RAM, and I have seen it
>>>grow to 3.3G and back down to 2.8G. We just rolled this out an hour ago.
>>>Our website response time is the same as before as well.
>>>
>>>We rolled to only 2 nodes (out of 6) in our cluster so far to test it out
>>>and let it soak a bit. We will slowly roll to more nodes, monitoring the
>>>performance as we go. Also, since dynamic snitch is not working with
>>>SimpleSnitch, we know that just one slow node affects our website (from
>>>personal pain/experience of nodes hitting the RAM limit and slowing down,
>>>causing the website to get real slow).
>>>
>>>Dean
>>>
>>>On 3/20/13 6:41 AM, "Andras Szerdahelyi" wrote:
>>>
>>>>2. Upping index_interval from 128 to 512 (this seemed to reduce our
>>>>memory usage significantly!!!)
>>>>
>>>>
>>>>I'd be very careful with that as a one-stop improvement solution, for
>>>>two reasons AFAIK:
>>>>1) you have to rebuild sstables ( not an issue if you are evaluating,
>>>>doing test writes etc., not so much in production )
>>>>2) it can affect reads ( the number of sstable reads needed to serve a
>>>>read ), especially if your key/row cache is ineffective
>>>>
>>>>On 20/03/13 13:34, "Hiller, Dean" wrote:
>>>>
>>>>>Also, look at the cassandra logs. I bet you see the typical "blah blah
>>>>>is at 0.85, doing memory cleanup" which is not exactly GC but cassandra
>>>>>memory management... and of course, you have GC on top of that.
>>>>>
>>>>>If you need to get your memory down, there are multiple ways:
>>>>>1. Switching size tiered compaction to leveled compaction (with 1
>>>>>billion narrow rows, this helped us quite a bit)
>>>>>2. Upping index_interval from 128 to 512 (this seemed to reduce our
>>>>>memory usage significantly!!!)
>>>>>3. Just add more nodes, as moving the rows to other servers reduces
>>>>>memory from #1 and #2 above since each server would have fewer rows
>>>>>
>>>>>Later,
>>>>>Dean
>>>>>
>>>>>On 3/20/13 6:29 AM, "Andras Szerdahelyi" wrote:
>>>>>
>>>>>>
>>>>>>I'd say GC. Please fill in form CASS-FREEZE-001 below and get back to
>>>>>>us :-) ( sorry )
>>>>>>
>>>>>>How big is your JVM heap? How many CPUs?
>>>>>>Is garbage collection taking long? ( look for log lines from
>>>>>>GCInspector )
>>>>>>Running out of heap? ( "heap is .. full" log lines )
full" log lines ) >>>>>>Any tasks backing up / being dropped ? ( nodetool tpstats and ".. >>>>>>dropped >>>>>>in last .. ms" log lines ) >>>>>>Are writes really slow? ( nodetool cfhistograms Keyspace ColumnFamily >>>>>>) >>>>>> >>>>>>How much is lots of data? Wide or skinny rows? Mutations/sec ? >>>>>>Which Compaction Strategy are you using? Output of show schema ( >>>>>>cassandra-cli ) for the relevant Keyspace/CF might help as well >>>>>> >>>>>>What consistency are you doing your writes with ? I assume ONE or ANY >>>>>>if >>>>>>you have a single node. >>>>>> >>>>>>What are the values for these settings in cassandra.yaml >>>>>> >>>>>>memtable_total_space_in_mb: >>>>>>memtable_flush_writers: >>>>>>memtable_flush_queue_size: >>>>>>compaction_throughput_mb_per_sec: >>>>>> >>>>>>concurrent_writes: >>>>>> >>>>>> >>>>>> >>>>>>Which version of Cassandra? >>>>>> >>>>>> >>>>>> >>>>>>Regards, >>>>>>Andras >>>>>> >>>>>>From: Joel Samuelsson >>>>>>Reply-To: "user@cassandra.apache.org" >>>>>>Date: Wednesday 20 March 2013 13:06 >>>>>>To: "user@cassandra.apache.org" >>>>>>Subject: Cassandra freezes >>>>>> >>>>>> >>>>>>Hello, >>>>>> >>>>>>I've been trying to load test a one node cassandra cluster. When I >>>>>>add >>>>>>lots of data, the Cassandra node freezes for 4-5 minutes during which >>>>>>neither reads nor writes are served. >>>>>>During this time, Cassandra takes 100% of a single CPU core. >>>>>>My initial thought was that this was Cassandra flushing memtables to >>>>>>the >>>>>>disk, however, the disk i/o is very low during this time. >>>>>>Any idea what my problem could be? >>>>>>I'm running in a virtual environment in which I have no control of >>>>>>drives. >>>>>>So commit log and data directory is (probably) on the same drive. >>>>>> >>>>>>Best regards, >>>>>>Joel Samuelsson >>>>>> >>>>> >>>> >>> >> >