Subject: Re: Are 6..8 seconds to read 23.000 small rows - as it should be?
From: Per Olesen
Date: Fri, 4 Jun 2010 20:20:04 +0200
To: "user@cassandra.apache.org"

On Jun 4, 2010, at 5:19 PM, Ben Browning wrote:

> How many subcolumns are in each supercolumn and how large are the
> values? Your example shows 8 subcolumns, but I didn't know if that was
> the actual number. I've been able to read columns out of Cassandra at
> an order of magnitude higher than what you're seeing here but there
> are too many variables to directly compare.

There are very few columns in each SC: about 8, though it varies a bit. The column names and values are pretty small, around 20-30 bytes for each column, I guess. So we are talking about small amounts of data here.

Yes, I know there are too many variables, but I have the feeling, as you also write, that the performance of this simple thing should be orders of magnitude better.

So, how might I go about finding out why this takes so long in my specific setup? Can I get timings of stuff inside Cassandra itself?

> Keep in mind that the results from each thrift call has to fit into
> memory - you might be better off paging through the 23000 columns,
> reading a few thousand at a time.

Yes, I know, and I might end up doing this in the end. I do have pretty hard upper limits on how many rows I will end up with for each key, but it might be a good idea nonetheless. Thanks for the advice on that one.

Per
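
For the paging suggestion quoted above, here is a minimal sketch of reading a wide row a page at a time while timing each call on the client side. It is not tied to any particular client library: `fetch_slice(start, count)` is a hypothetical callable standing in for whatever slice call the client exposes, and the sketch assumes it returns up to `count` columns in name order beginning at `start` inclusive (with an empty string meaning "from the first column"). `make_fake_store` is only an in-memory stand-in so the sketch can run by itself.

```python
import bisect
import time
from typing import Callable, Iterator, List, Tuple

Column = Tuple[str, bytes]  # (column name, value)


def page_columns(
    fetch_slice: Callable[[str, int], List[Column]],
    page_size: int = 1000,
) -> Iterator[Tuple[List[Column], float]]:
    """Yield (page, seconds) pairs until the row is exhausted.

    Assumes fetch_slice(start, count) returns at most `count` columns in
    name order, starting at `start` inclusive ("" = first column).
    """
    start = ""
    first = True
    while True:
        # On later pages the inclusive start repeats the last column already
        # seen, so ask for one extra column and drop the duplicate.
        want = page_size if first else page_size + 1
        t0 = time.perf_counter()
        batch = fetch_slice(start, want)
        elapsed = time.perf_counter() - t0
        if not first:
            batch = batch[1:]
        if not batch:
            return
        yield batch, elapsed
        if len(batch) < page_size:
            return  # short page: nothing left to read
        start = batch[-1][0]
        first = False


def make_fake_store(n: int) -> Callable[[str, int], List[Column]]:
    """Hypothetical in-memory stand-in for a row with n small columns."""
    cols = [(f"col{i:05d}", b"x" * 25) for i in range(n)]
    names = [name for name, _ in cols]

    def fetch_slice(start: str, count: int) -> List[Column]:
        i = bisect.bisect_left(names, start)  # "" sorts before every name
        return cols[i:i + count]

    return fetch_slice


if __name__ == "__main__":
    fetch = make_fake_store(23000)
    for page, seconds in page_columns(fetch, page_size=2000):
        print(f"{len(page)} columns in {seconds * 1000:.2f} ms")
```

Timing each page separately only measures what the client sees (server work plus network plus deserialization), so it can show whether the cost grows with page size, but it does not by itself say where inside the server the time goes.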