From: aaron morton
To: user@cassandra.apache.org
Subject: Re: get_range_slices perf
Date: Mon, 14 Mar 2011 14:11:51 +1300

What are you using for the SlicePredicate with get_range_slices()? What sort of performance are you getting for each request (client and server side)?

Even if you are asking for zero columns, there is still a lot of work to be done when performing a range scan, e.g. each SSTable must be checked and the columns reduced to the current view. At first glance this does not look right though.

I think the more SSTables and the more tombstones you have, the worse the performance will be.

Hope that helps.
Aaron

On 14 Mar 2011, at 12:11, Jeffrey Wang wrote:

> Hey all,
>
> I'm trying to get a list of all the rows from a column family using get_range_slices retrieving no actual columns. I expected this operation to be pretty quick, but it seems to take a while (5-node 0.7.0 cluster takes 20 min to page through 60k keys 1000 at a time). It's not completely clear to me from the code, but is there a lot of SSTable reading involved when getting just the row names? And is this the best way to read all of the row names in a CF? Thanks.
>
> -Jeffrey

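
For reference, the paging pattern described in the question (walking all row keys a page at a time with no columns requested) can be sketched roughly as below. This is only a sketch under stated assumptions, not the poster's actual code: `fetch_page` is a hypothetical callable standing in for whatever client call wraps get_range_slices (a KeyRange with the given start key, an empty end key and the page size, plus a SlicePredicate whose SliceRange asks for zero columns), and `make_in_memory_fetch` is a made-up in-memory stand-in so the loop can be run without a cluster.

```python
from typing import Callable, Iterator, List, Optional

# Hypothetical signature: fetch_page(start_key, count) returns up to `count`
# row keys in scan order, beginning at start_key INCLUSIVE. A real version
# would wrap the client's get_range_slices call; that wrapper is not shown.
FetchPage = Callable[[str, int], List[str]]


def iter_row_keys(fetch_page: FetchPage, page_size: int = 1000) -> Iterator[str]:
    """Yield every row key once, paging `page_size` keys per request."""
    if page_size < 2:
        # With an inclusive start key, a page of 1 would never advance.
        raise ValueError("page_size must be at least 2")

    last_key: Optional[str] = None
    while True:
        start_key = "" if last_key is None else last_key
        page = fetch_page(start_key, page_size)

        # Each page after the first starts with the key that ended the
        # previous page, so skip that one repeated key.
        new_keys = page
        if last_key is not None and page and page[0] == last_key:
            new_keys = page[1:]
        for key in new_keys:
            yield key

        # A short page means the scan has reached the end of the range.
        if len(page) < page_size:
            return
        last_key = page[-1]


def make_in_memory_fetch(keys: List[str]):
    """Made-up stand-in for a cluster: sorted string order plays the role
    of the partitioner's scan order (an assumption for illustration)."""
    ordered = sorted(keys)
    calls: List[str] = []

    def fetch(start_key: str, count: int) -> List[str]:
        calls.append(start_key)  # record each request for inspection
        return [k for k in ordered if k >= start_key][:count]

    return fetch, calls
```

With 60k keys and a page size of 1000 this loop issues roughly 60 requests, so the 20 minutes reported above works out to about 20 seconds per request. Timing a single `fetch_page` call on the client and comparing it with the server-side latency is one way to see where that time goes.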