incubator-cassandra-user mailing list archives

From yaw <yawy...@gmail.com>
Subject Re: Poor performance; PHP & Thrift to blame
Date Tue, 30 Mar 2010 12:51:02 GMT
Hi David,
I have seen your guide at
https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP.

I use Cassandra with a PHP client.
Until now, I have been using the Thrift PHP classes that I found in the Pandra
project (a high-level PHP client), as I was unable to install or build the
Thrift compiler on my old Debian Etch OS.


I cannot find the native PHP extension you are speaking about. I don't
understand whether this extension can replace the PHP classes that are
generated with the Thrift compiler.


2010/3/30 David Timothy Strauss <david@fourkitchens.com>

> This sounds like the sort of analysis that shouldn't be done in PHP. Have
> you tried Hadoop + Cassandra 0.6?
>
> -----Original Message-----
> From: Julian Simon <jsimon@jules.com.au>
> Date: Tue, 30 Mar 2010 22:21:22
> To: <user@cassandra.apache.org>
> Subject: Re: Poor performance; PHP & Thrift to blame
>
> Yes I tested it with and without APC - it had a negligible impact on
> performance.
>
> This didn't surprise me - most of the optimization that APC offers is
> in the parsing of PHP code; seeing as the benchmark is a single PHP
> process the code parsing overhead occurs outside the benchmark loop.
>
> Does anyone have any benchmarks for larger Cassandra queries from PHP
> similar to what I'm trying to do?  The performance bottlenecks don't
> show up on 1,5,10, or even 100 column query sets - only for larger
> sets or query loops.
>
> Anyone doing time series analysis?  This is the sort of use case where
> I'd expect to see much larger query sets.
>
> I suppose Facebook and Digg are only pulling out small column sets, so
> they wouldn't necessarily notice this issue.
>
>
>
> On Tue, Mar 30, 2010 at 8:00 PM, David Timothy Strauss
> <david@fourkitchens.com> wrote:
> > Without APC, there should be even more of an improvement with the Thrift
> > PHP extension.
> >
> > ----- "Rauan Maemirov" <rauan@maemirov.com> wrote:
> >
> >> What about APC? Did you turn it on?
> >>
> >> 2010/3/30 Julian Simon <jsimon@jules.com.au>:
> >> > Hi,
> >> >
> >> > I've been trying to benchmark Cassandra for our use case and have been
> >> > seeing poor performance on both writes and (extremely) poor
> >> > performance on reads.
> >> >
> >> > Using Cassandra 0.51 stable & thrift-0.2.0.
> >> >
> >> > It turns out all the CPU time is going to the PHP client process - the
> >> > JVM operating the Cassandra server isn't breaking much of a sweat.
> >> >
> >> > For reads the latency is often up to 1 second to fetch a row
> >> > containing ~2000 columns, or around 300ms to fetch a 500-column wide
> >> > row.  This is with get_slice(), and a predicate specifying the start &
> >> > finish range.
> >> >
> >> > Using cachegrind and inspecting the code inside the Thrift bindings
> >> > makes it pretty clear why the performance is so bad, particularly on
> >> > reads. The biggest culprit is the translation code which casts data
> >> > back and forth into binary representations for sending over the wire
> >> > to the Cassandra server.
> >> >
> >> > There seems to be some 32-bit specific code which iterates heavily
> >> > apparently due to a limitation in PHPs implementation of LONGs.
> >> >
> >> > However, testing on a 64-bit host doesn't yield any performance
> >> > improvement.
> >> >
> >> > More surprisingly, if I compile and enable the PHP native thrift
> >> > bindings (following this guide
> >> > https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP)
> >> > read performance actually degrades by another 50%.  I have verified
> >> > that the Thrift code is recognizing and using the native PHP functions
> >> > provided by the library.
> >> >
> >> > I've tested all of this on both 32-bit and 64-bit installations of
> >> > both PHP 5.1 & 5.2.  Results are the same in all cases.
> >> >
> >> > My environment is on vanilla CentOS 5.4 server installations inside
> >> > VMWare on a 4 core 64bit host with plenty of RAM and fast disks.
> >> >
> >> > Has anyone been able to produce decent performance with PHP &
> >> > Cassandra?  If so, how have you done it?
> >> >
> >> > Thanks,
> >> > Jules
> >> >
> >
> > --
> > David Strauss
> >   | david@fourkitchens.com
> >   | +1 512 577 5827 [mobile]
> > Four Kitchens
> >   | http://fourkitchens.com
> >   | +1 512 454 6659 [office]
> >   | +1 512 870 8453 [direct]
> >
>
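
For anyone following the thread, here is a rough illustration of the kind of
work the quoted message attributes to the 32-bit code path. It is only a
sketch, written in Python rather than PHP, and it is not the actual Thrift
binding code. It assumes a big-endian 8-byte wire format for 64-bit integers
and uses hypothetical function names. The point is that a platform without
native 64-bit integers has to build each value from two 32-bit halves, which
adds per-value work that grows with the number of columns fetched.

```python
import struct

MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def pack_i64_native(value: int) -> bytes:
    """Encode a signed 64-bit integer as 8 big-endian bytes in one call."""
    return struct.pack(">q", value)


def pack_i64_split(value: int) -> bytes:
    """Encode the same integer using only 32-bit pieces.

    Hypothetical stand-in for what a runtime without native 64-bit
    integers has to do for every value it serializes.
    """
    unsigned = value & MASK64          # two's-complement view of the value
    high = (unsigned >> 32) & MASK32   # upper 32 bits
    low = unsigned & MASK32            # lower 32 bits
    return struct.pack(">II", high, low)


def unpack_i64_split(data: bytes) -> int:
    """Decode 8 big-endian bytes by recombining two unsigned 32-bit halves."""
    high, low = struct.unpack(">II", data)
    unsigned = (high << 32) | low
    # Reinterpret the unsigned result as a signed 64-bit integer.
    if unsigned >= (1 << 63):
        return unsigned - (1 << 64)
    return unsigned


if __name__ == "__main__":
    # A millisecond timestamp, the sort of value a column might carry.
    ts = 1269953462000
    assert pack_i64_split(ts) == pack_i64_native(ts)
    assert unpack_i64_split(pack_i64_native(-ts)) == -ts
    print("split and native encodings agree")
```

Both paths produce the same bytes; the split version simply does more
arithmetic per value, and a row with ~2000 columns repeats that work for
every column.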
