cassandra-user mailing list archives

From Lee Parker <>
Subject Re: cassandra instability
Date Fri, 16 Apr 2010 19:30:10 GMT
I don't think it is a hardware issue.  This is happening on multiple servers
and clients on ec2 instances and my local development VM.  I think you are
right that the timestamp errors are likely being caused by the Thrift PHP
bindings.  The frustrating part is that I can't get the error to
consistently reproduce when I have debugging systems in place.

As for the Memtable thresholds, when I ran with lower thresholds, the server
would be thrashing with compaction runs due to the dramatically increased
number of sstable files.  That was when I was running 0.5.0.  Has 0.6.0
improved compaction performance such that this shouldn't be an issue?

Lee Parker
On Fri, Apr 16, 2010 at 1:13 PM, Jonathan Ellis <> wrote:

> On Fri, Apr 16, 2010 at 12:50 PM, Lee Parker <> wrote:
> > Each time I start it up, it will
> > work fine for about 1 hour and then it will crash the servers.  The error
> > message on the servers is usually an out of memory error.
> Sounds like
> to me.
> > I will get
> > several time out errors on the clients
> Symptomatic of running out of memory.
> > and occasionally get an error telling
> > me that I was missing the timestamp.
> This is an entirely different problem.  Your client is sending
> garbage, plain and simple.  Why that is, I don't know.  The PHP Thrift
> binding is virtually unmaintained, so it could be a bug there, but
> Digg uses PHP against Cassandra extensively and hasn't hit this to my
> knowledge.  As I said in another thread, I wouldn't rule out bad
> hardware.
> > The timestamp error is accompanied by
> > a server crashing if I use framed transport instead of buffered.
> Thrift is fragile when the client sends it garbage.
> (
> > One of the reasons we
> > were trying cassandra was to scale out with smaller nodes rather than having
> > to run larger instances for mysql.
> 2 x 1GB isn't a whole lot to do a bulk load with.  You may have to
> throttle your clients to fix the OOM completely.
> -Jonathan
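
For reference, the client-side throttling suggested above could be sketched roughly as follows. This is only a minimal illustration in Python, not the PHP client discussed in this thread; `Throttle`, `bulk_load`, `max_per_second`, and `send_batch` are hypothetical names, and the right rate would have to be found by experiment against the actual cluster.

```python
import time


class Throttle:
    """Space out calls so that at most max_per_second happen per second."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        if max_per_second <= 0:
            raise ValueError("max_per_second must be positive")
        self.min_interval = 1.0 / max_per_second
        self._clock = clock          # injectable so the class can be tested
        self._sleep = sleep
        self._next_allowed = None    # earliest time the next call may proceed

    def wait(self):
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay


def bulk_load(rows, send_batch, throttle):
    """Send each row through send_batch (a hypothetical write callback),
    pausing between writes so the servers are not flooded."""
    for row in rows:
        throttle.wait()
        send_batch(row)
```

A loader would create one `Throttle` per client process, for example `Throttle(max_per_second=200)`, and lower the rate until the out of memory errors stop.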
