Date: Tue, 30 Mar 2010 20:47:48 -0700
From: James Golick
To: cassandra-user@incubator.apache.org
Subject: Read Performance
Message-ID: <1ab2da821003302047h128aaed9g1a65f3972836b086@mail.gmail.com>

We are starting to use Cassandra to power our activity feed. The way we organize our data is simple. "Event"s live in a CF called Events and are keyed by a UUID. The timelines themselves live in a CF called Timelines, which is keyed by user id (e.g. "1229") and contains event UUIDs as column names (sorted by TimeUUIDType).

To load a feed, we get a slice of the Timelines CF for that user, then multiget all of the corresponding events.

Loading the slice of the timeline is reasonably fast at 4-6ms. But multigetting the events is terribly slow, on the order of 35-100ms.

To alleviate the problem, we write events through to memcached and use a memcached multiget in front of the Cassandra multiget. We have enough cache space to get upwards of a 99% hit rate, which makes loading the events extremely fast, but it would be nice to make use of the 24GB of memory in our Cassandra nodes.

We're on 0.6, and I've enabled the row cache. It seems to have data in it, but it's still slow.

So, am I doing something wrong, or is this the expected perf?

- James
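
The read path described in this message can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the actual application code: `FakeStore`, `write_event`, and `load_feed` are hypothetical names, in-memory dicts stand in for memcached and for the Events and Timelines CFs, the timeline is assumed to be read newest first, and backfilling the cache on a miss is an assumption the message does not state.

```python
from typing import Dict, Iterable, List, Optional


class FakeStore:
    """In-memory stand-in for a key/value backend (memcached or the Events CF)."""

    def __init__(self, data: Optional[Dict[str, dict]] = None) -> None:
        self.data: Dict[str, dict] = dict(data or {})
        self.multiget_calls: List[List[str]] = []  # records each batch of keys requested

    def multiget(self, keys: Iterable[str]) -> Dict[str, dict]:
        keys = list(keys)
        self.multiget_calls.append(keys)
        # Only keys that exist are returned, as a cache or store would do.
        return {k: self.data[k] for k in keys if k in self.data}

    def set_many(self, items: Dict[str, dict]) -> None:
        self.data.update(items)


def write_event(event_id: str, event: dict, user_id: str,
                timelines: Dict[str, List[str]],
                cache: FakeStore, events_cf: FakeStore) -> None:
    """Write-through: store the event, cache it, and add it to the user's timeline."""
    events_cf.set_many({event_id: event})
    cache.set_many({event_id: event})
    # Assumption: the newest event goes first, standing in for a reversed TimeUUID slice.
    timelines.setdefault(user_id, []).insert(0, event_id)


def load_feed(user_id: str, timelines: Dict[str, List[str]],
              cache: FakeStore, events_cf: FakeStore, count: int = 20) -> List[dict]:
    """Slice the timeline, then multiget events from the cache before the store."""
    event_ids = timelines.get(user_id, [])[:count]  # step 1: timeline slice
    if not event_ids:
        return []
    found = cache.multiget(event_ids)               # step 2: cache multiget
    missing = [e for e in event_ids if e not in found]
    if missing:
        fetched = events_cf.multiget(missing)       # step 3: store multiget for misses only
        cache.set_many(fetched)                     # assumed backfill of the cache
        found.update(fetched)
    # Preserve timeline order; skip ids whose event could not be found anywhere.
    return [found[e] for e in event_ids if e in found]
```

With a hit rate near 99%, almost every call ends after step 2, so the slow store multiget is only reached for the few ids the cache does not hold.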