From: Mike
Date: Sun, 02 Dec 2012 16:36:06 -0500
To: user@cassandra.apache.org
Subject: Row caching + Wide row column family == almost crashed?

Hello,

We recently hit an issue within our Cassandra-based application. We have a relatively new Column Family with some very wide rows (tens of thousands of columns, or more in some cases). During a periodic activity, we query the range of columns to retrieve various pieces of information, a segment at a time. We do these same queries frequently at various stages of the process, and I thought the application could see a performance benefit from row caching.

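To make the access pattern concrete, here is a minimal Python sketch of the segment-at-a-time reads. It is not our application code: `fetch_slice`, `page_through_row`, and the in-memory `row` list are hypothetical stand-ins for a column slice query issued through a client library, and the page size of 200 is the limit we use on these range queries.

```python
from bisect import bisect_left

PAGE_SIZE = 200  # assumption: the per-query limit on returned columns


def fetch_slice(row, start, limit):
    """Hypothetical stand-in for a column slice query.

    `row` is a list of (column_name, value) pairs sorted by name.
    Returns up to `limit` columns whose name is >= `start`.
    """
    names = [name for name, _ in row]
    i = bisect_left(names, start)
    return row[i:i + limit]


def page_through_row(row, page_size=PAGE_SIZE):
    """Yield the row one segment at a time, resuming after the last column seen."""
    start = ""  # the empty string sorts before every column name
    while True:
        page = fetch_slice(row, start, page_size)
        if not page:
            return
        yield page
        if len(page) < page_size:
            return  # a short page means the row is exhausted
        # Smallest string strictly greater than the last name returned.
        start = page[-1][0] + "\x00"


if __name__ == "__main__":
    wide_row = [("col%06d" % i, i) for i in range(50000)]
    pages = list(page_through_row(wide_row))
    print(len(pages), "slice queries for", len(wide_row), "columns")
```

Under these assumptions, a single pass over a 50,000-column row issues 250 separate slice queries against the same row, and we repeat such passes at several stages of the process.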
We have a small row cache (100MB per node) already enabled, and I enabled row caching on the new column family. The results were very negative. When performing range queries with a limit of 200 results, for a small minority of the rows in the new column family, performance plummeted. CPU utilization on the Cassandra node went through the roof, and it started chewing up memory. Some queries to this column family hung completely.

According to the logs, we started getting frequent GCInspector messages. Cassandra started flushing the largest memtables due to hitting the "flush_largest_memtables_at" threshold of 75%, and scaling back the key/row caches. However, to Cassandra's credit, it did not die with an OutOfMemory error. Its emergency measures to conserve memory worked, and the cluster stayed up and running. No real errors showed in the logs, except for messages getting dropped, which I believe was caused by what was going on with CPU and memory.

Disabling row caching on this new column family has resolved the issue for now, but is there something fundamental about row caching that I am missing?

We are running Cassandra 1.1.2 on a 6-node cluster with a replication factor of 3.

Thanks,
-Mike