Date: Sun, 12 Dec 2010 19:07:27 +1100
Subject: Re: OutOfMemory on count on cassandra 0.6.8 for large number of columns
From: Dave Martin
To: user@cassandra.apache.org

Thanks Tyler. I was unaware of counters.

The use case for column counts is really an operational one: to allow a sysadmin to do ad hoc checks on columns to see if something has gone wrong in software outside of Cassandra.

I think running a cassandra-cli command such as count, which makes Cassandra fall over, is not ideal, unless we can say that for X number of columns Cassandra needs at least Y memory allocation for stability.

Cheers
Dave

On Sun, Dec 12, 2010 at 6:39 PM, Tyler Hobbs wrote:
> Cassandra has to deserialize all of the columns in the row for get_count().
> So from Cassandra's perspective, it's almost as much work as getting the
> entire row, it just doesn't have to send everything back over the network.
>
> If you're frequently counting 8 million columns (or really, anything
> significant), you need to use counters instead. If this is a rare
> occurrence, you can do the count in multiple chunks by using a starting and
> ending column in the SlicePredicate for each chunk, but this requires some
> rough knowledge about the distribution of the column names in the row.
>
> - Tyler
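
For illustration, here is a rough sketch of counting a wide row in bounded chunks. It uses a paging variant of the approach Tyler describes: rather than choosing start and end columns up front, each request starts at the last column name seen, so no knowledge of the name distribution is needed. The `fetch_names(start, limit)` callable is hypothetical, not a Cassandra API; it stands in for whatever slice call your client exposes that returns up to `limit` column names in sorted order, beginning at `start` inclusive. `make_fake_row` is an in-memory stand-in so the sketch runs without a cluster.

```python
from bisect import bisect_left
from typing import Callable, Iterable, List


def count_columns_in_chunks(fetch_names: Callable[[str, int], List[str]],
                            chunk_size: int = 1000) -> int:
    """Count the columns in one row by paging through names in bounded chunks.

    fetch_names(start, limit) is a HYPOTHETICAL callable (an assumption, not a
    real client method). It must return up to `limit` column names in sorted
    order, starting at `start` inclusive; "" means "from the first column".
    """
    if chunk_size < 2:
        # With a page size of 1 the inclusive start would never advance.
        raise ValueError("chunk_size must be at least 2")

    total = 0
    start = ""
    first_page = True
    while True:
        page = fetch_names(start, chunk_size)
        # The start column is inclusive, so every page after the first
        # repeats the last column of the previous page; skip that overlap.
        if not first_page and page and page[0] == start:
            new_names = page[1:]
        else:
            new_names = page
        total += len(new_names)
        if len(page) < chunk_size:
            # A short page means the end of the row has been reached.
            return total
        start = page[-1]
        first_page = False


def make_fake_row(names: Iterable[str]) -> Callable[[str, int], List[str]]:
    """In-memory stand-in for a row, used only to exercise the sketch."""
    ordered = sorted(names)

    def fetch_names(start: str, limit: int) -> List[str]:
        i = bisect_left(ordered, start)
        return ordered[i:i + limit]

    return fetch_names


if __name__ == "__main__":
    row = make_fake_row("col%05d" % i for i in range(2500))
    print(count_columns_in_chunks(row, chunk_size=1000))
```

With this shape, each request handles at most `chunk_size` columns rather than the whole row, at the cost of more round trips.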