cassandra-user mailing list archives

From Lee Parker <...@socialagency.com>
Subject Re: cassandra instability
Date Mon, 19 Apr 2010 00:28:57 GMT
I did regenerate the thrift bindings.  What I have found in testing is that
the batch_mutate command occasionally sends bad data to thrift when I try to
insert a set of items with too many columns.  I don't know whether this is a
problem with PHP or with the thrift PHP library.  I have found that a limit of
1000 columns per call is fast enough for my needs and stable.  Previously, I
was regularly sending 6000 columns (500 rows with about 12 columns each).
Most of the columns in each row were fairly small, but some of the rows had
a rather large block of text.  When this was happening, the output of
TBinaryProtocol would be incorrect at seemingly random times.  This
would then cause an error from cassandra saying that I was missing my
timestamp.  Enough of these errors and cassandra would crash with an out of
memory error.  If enough data was on the servers when this happened,
cassandra couldn't recover from the error because I didn't have enough
memory on the machines.  I have now upgraded to larger machines, and that has
cleared up the real memory issues.
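
For anyone who wants to apply the same column limit, here is a rough sketch
of the batching idea.  It is written in Python rather than PHP, makes no
thrift or cassandra calls, and the function name and row layout are made up
for illustration.  It only groups rows so that each group stays under a
column budget; each group would then go into its own batch_mutate call.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical row layout for this sketch: (row_key, [(column_name, value), ...])
Column = Tuple[str, str]
Row = Tuple[str, List[Column]]


def chunk_rows_by_columns(rows: Iterable[Row],
                          max_columns: int = 1000) -> Iterator[List[Row]]:
    """Yield lists of rows whose combined column count is <= max_columns.

    Rows are never split.  A single row with more than max_columns columns
    is yielded alone in its own batch.
    """
    batch: List[Row] = []
    count = 0
    for key, columns in rows:
        n = len(columns)
        # Flush the current batch before this row would push it over budget.
        if batch and count + n > max_columns:
            yield batch
            batch, count = [], 0
        batch.append((key, columns))
        count += n
    if batch:
        yield batch


# Example matching the numbers above: 500 rows with 12 columns each.
rows = [("row%d" % i, [("col%d" % j, "v") for j in range(12)])
        for i in range(500)]
batches = list(chunk_rows_by_columns(rows, max_columns=1000))
```

With 12 columns per row, each batch holds at most 83 rows (996 columns), so
the 6000 columns go out as several smaller calls instead of one large one.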

Lee Parker
On Sun, Apr 18, 2010 at 6:51 PM, Brandon Williams <driftx@gmail.com> wrote:

> On Fri, Apr 16, 2010 at 12:50 PM, Lee Parker <lee@socialagency.com> wrote:
>
>> This process is running on two clients each working on a separate part of
>> the mysql data which totals to about 70G.  Each time I start it up, it will
>> work fine for about 1 hour and then it will crash the servers.  The error
>> message on the servers is usually an out of memory error.  I will get
>> several timeout errors on the clients and occasionally get an error telling
>> me that I was missing the timestamp.  The timestamp error is accompanied by
>> a server crashing if I use framed transport instead of buffered.  I wasn't
>> having the out of memory errors with 0.5.0, but had lots of timeouts and
>> some "unknown result" errors.  So we upgraded to 0.6.0 when it became the
>> stable release.
>>
>
> Did you regenerate the php thrift bindings between 0.5 and 0.6?  There's a
> decent chance that thrift made some kind of backwards incompatible change
> between those revisions (look in the lib dir of each cassandra version to
> determine the thrift svn revision you need.)  If that happened, then it's
> possible the old bindings are sending something the newer version does not
> understand, and causing you to run into THRIFT-601, crashing the server.
>
> -Brandon
>
