Subject: Re: Poor performance; PHP & Thrift to blame
From: Julian Simon
To: user@cassandra.apache.org
Date: Tue, 30 Mar 2010 22:21:22 +1100

Yes I tested it with and
without APC - it had a negligible impact on performance. This didn't surprise me - most of the optimization that APC offers is in the parsing of PHP code; seeing as the benchmark is a single PHP process the code parsing overhead occurs outside the benchmark loop.

Does anyone have any benchmarks for larger Cassandra queries from PHP similar to what I'm trying to do? The performance bottlenecks don't show up on 1, 5, 10, or even 100 column query sets - only for larger sets or query loops.

Anyone doing time series analysis? This is the sort of use case where I'd expect to see much larger query sets.

I suppose Facebook and Digg are only pulling out small column sets, so they wouldn't necessarily notice this issue.

On Tue, Mar 30, 2010 at 8:00 PM, David Timothy Strauss wrote:
> Without APC, there should be even more of an improvement with the Thrift PHP extension.
>
> ----- "Rauan Maemirov" wrote:
>
>> What about APC? Did you turn it on?
>>
>> 2010/3/30 Julian Simon:
>> > Hi,
>> >
>> > I've been trying to benchmark Cassandra for our use case and have been
>> > seeing poor performance on both writes and (extremely) poor
>> > performance on reads.
>> >
>> > Using Cassandra 0.51 stable & thrift-0.2.0.
>> >
>> > It turns out all the CPU time is going to the PHP client process - the
>> > JVM operating the Cassandra server isn't breaking much of a sweat.
>> >
>> > For reads the latency is often up to 1 second to fetch a row
>> > containing ~2000 columns, or around 300ms to fetch a 500-column wide
>> > row. This is with get_slice(), and a predicate specifying the start &
>> > finish range.
>> >
>> > Using cachegrind and inspecting the code inside the Thrift bindings
>> > makes it pretty clear why the performance is so bad, particularly on
>> > reads. The biggest culprit is the translation code which casts data
>> > back and forth into binary representations for sending over the wire
>> > to the Cassandra server.
>> >
>> > There seems to be some 32-bit specific code which iterates heavily
>> > apparently due to a limitation in PHP's implementation of LONGs.
>> >
>> > However, testing on a 64-bit host doesn't yield any performance improvement.
>> >
>> > More surprisingly, if I compile and enable the PHP native thrift
>> > bindings (following this guide
>> > https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP)
>> > read performance actually degrades by another 50%. I have verified
>> > that the Thrift code is recognizing and using the native PHP functions
>> > provided by the library.
>> >
>> > I've tested all of this on both 32-bit and 64-bit installations of
>> > both PHP 5.1 & 5.2. Results are the same in all cases.
>> >
>> > My environment is on vanilla CentOS 5.4 server installations inside
>> > VMWare on a 4 core 64bit host with plenty of RAM and fast disks.
>> >
>> > Has anyone been able to produce decent performance with PHP &
>> > Cassandra? If so, how have you done it?
>> >
>> > Thanks,
>> > Jules
>> >
>
> --
> David Strauss
>   | david@fourkitchens.com
>   | +1 512 577 5827 [mobile]
> Four Kitchens
>   | http://fourkitchens.com
>   | +1 512 454 6659 [office]
>   | +1 512 870 8453 [direct]
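
For anyone who wants to see the shape of the per-value translation work described in the quoted message, here is a minimal sketch in Python rather than PHP. It assumes the Thrift binary protocol's layout of a 64-bit integer as 8 big-endian bytes, and it assumes that a client without native 64-bit integers handles each value as two 32-bit halves. The function names (encode_i64_halves, decode_i64_halves, time_decode) and the synthetic values are hypothetical; this is not the Thrift PHP code, it does not talk to Cassandra, and its timings say nothing about the PHP numbers above. It only shows how one might time the decode step alone across the row widths mentioned in the thread.

```python
import struct
import time


def encode_i64_halves(value: int) -> bytes:
    """Encode a signed 64-bit integer as 8 big-endian bytes via two 32-bit halves."""
    unsigned = value & 0xFFFFFFFFFFFFFFFF  # two's-complement view of the value
    hi = (unsigned >> 32) & 0xFFFFFFFF     # upper 32 bits
    lo = unsigned & 0xFFFFFFFF             # lower 32 bits
    return struct.pack(">II", hi, lo)


def decode_i64_halves(data: bytes) -> int:
    """Decode 8 big-endian bytes back into a signed 64-bit integer."""
    hi, lo = struct.unpack(">II", data)
    unsigned = (hi << 32) | lo
    # Values with the top bit set are negative in two's complement.
    return unsigned - (1 << 64) if unsigned >= (1 << 63) else unsigned


def time_decode(column_count: int, repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds to decode column_count values."""
    # Synthetic stand-ins for per-column 64-bit values (an assumption, not real data).
    payload = [encode_i64_halves(1_000_000 + i) for i in range(column_count)]
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for chunk in payload:
            decode_i64_halves(chunk)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Row widths taken from the thread: small sets, then 500 and ~2000 columns.
    for n in (1, 10, 100, 500, 2000):
        print(f"{n:>5} columns: {time_decode(n) * 1e3:.3f} ms")
```

Timing the decode loop on its own, separately from the network round trip, is one way to check whether client-side translation really dominates as the column count grows.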