From: Rauan Maemirov
To: user@cassandra.apache.org
Subject: Re: Poor performance; PHP & Thrift to blame
Date: Tue, 30 Mar 2010 14:48:21 +0600

What about APC? Did you turn it on?
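
For reference, turning APC on usually comes down to a couple of php.ini lines. This is only a minimal sketch, assuming the PECL APC extension is already built and installed as apc.so; the php.ini location and the module path depend on your distribution:

```ini
; Assumption: APC was installed via PECL and the module file is named apc.so
extension=apc.so
; Turn the opcode cache on (this is also the default once the module is loaded)
apc.enabled=1
```

APC caches compiled opcodes, so it removes the cost of re-parsing the Thrift PHP classes on every request; it does not change the cost of the serialization code itself.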

2010/3/30 Julian Simon:
> Hi,
>
> I've been trying to benchmark Cassandra for our use case and have been
> seeing poor performance on both writes and (extremely) poor
> performance on reads.
>
> Using Cassandra 0.51 stable & thrift-0.2.0.
>
> It turns out all the CPU time is going to the PHP client process - the
> JVM operating the Cassandra server isn't breaking much of a sweat.
>
> For reads the latency is often up to 1 second to fetch a row
> containing ~2000 columns, or around 300ms to fetch a 500-column wide
> row. This is with get_slice(), and a predicate specifying the start &
> finish range.
>
> Using cachegrind and inspecting the code inside the Thrift bindings
> makes it pretty clear why the performance is so bad, particularly on
> reads. The biggest culprit is the translation code which casts data
> back and forth into binary representations for sending over the wire
> to the Cassandra server.
>
> There seems to be some 32-bit specific code which iterates heavily
> apparently due to a limitation in PHPs implementation of LONGs.
>
> However, testing on a 64-bit host doesn't yield any performance improvement.
>
> More surprisingly, if I compile and enable the PHP native thrift
> bindings (following this guide
> https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP)
> read performance actually degrades by another 50%. I have verified
> that the Thrift code is recognizing and using the native PHP functions
> provided by the library.
>
> I've tested all of this on both 32-bit and 64-bit installations of
> both PHP 5.1 & 5.2. Results are the same in all cases.
>
> My environment is on vanilla CentOS 5.4 server installations inside
> VMWare on a 4 core 64bit host with plenty of RAM and fast disks.
>
> Has anyone been able to produce decent performance with PHP &
> Cassandra? If so, how have you done it?
>
> Thanks,
> Jules