cassandra-user mailing list archives

From Juho Mäkinen <juho.maki...@gmail.com>
Subject get_slice sometimes returns previous result on php
Date Mon, 30 Aug 2010 13:05:56 GMT
I've run into a strange bug where get_slice returns the result of the
previous query. My application iterates over a set of columns inside a
supercolumn, and sometimes (rarely, but often enough to be noticeable)
the results get "shifted" so that the application receives the result
of the previous query. The application reuses the same cassandra
thrift connection (it doesn't close it in between), and everything
happens inside the same php process.

Here's a cleaned up example from logs where this happens:

14:40 suomirock php-fi: [MISC] WARNING /blog.php: Cassandra stored
blog content for blog id 47528165 differs from database content.
14:40 suomirock php-fi: [MISC] WARNING /blog.php: from cassandra: AAAAAAAA
14:40 suomirock php-fi: [MISC] WARNING /blog.php: from database : BBBBBBBBB

14:40 suomirock php-fi: [MISC] WARNING /blog.php: Cassandra stored
blog content for blog id 47523032 differs from database content.
14:40 suomirock php-fi: [MISC] WARNING /blog.php: from cassandra: BBBBBBBBB
14:40 suomirock php-fi: [MISC] WARNING /blog.php: from database : CCCCCCCCCC

The data model is a Super Column family which stores blog entries.
Each user has a single row. Inside this row there are super columns,
each of which contains a single blog entry. The name of the super
column is the blog id number, and one of the columns inside it
contains the blog content.
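
To make the layout concrete, here is a minimal sketch of one user's row
as nested dictionaries (Python is used only for illustration; the
application itself is PHP). The row key "user-1" and the helper name
get_blog_content are made up for this example; the blog ids and
placeholder values are the ones used elsewhere in this message.

```python
# Hypothetical in-memory picture of the layout: row -> super column -> column.
# "user-1" is an invented row key; ids and values are this message's placeholders.
blog_rows = {
    "user-1": {                                  # one row per user
        "47540671": {"content": "AAAAAAAA"},     # one super column per blog entry
        "47528165": {"content": "BBBBBBBBB"},
        "47523032": {"content": "CCCCCCCCCC"},
    }
}


def get_blog_content(rows, user_key, blog_id):
    """Return the 'content' column of the super column named after blog_id."""
    return rows[user_key][str(blog_id)]["content"]
```

With this picture, asking for blog id 47528165 should always yield
BBBBBBBBB, which is what the old database returns as well.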

The data in cassandra is correct and matches what's in our old
storage tier (PostgreSQL), so I'm able to compare the data returned
from cassandra with the data returned from the old database.
Here's part of the output from cassandra-cli where I queried the row
for this user. As you can see, the "blog id" matches the super_column
inside cassandra.

=> (super_column=47540671, (column=content, value=AAAAAAAA,
timestamp=1282940401925456) )
=> (super_column=47528165, (column=content, value=BBBBBBBBB,
timestamp=1282940401925456) )
=> (super_column=47523032, (column=content, value=CCCCCCCCCC,
timestamp=1282940401925456) )

I'm in the middle of writing a bunch of debugging code to get better
data on what's really going on, but I'd be very happy if someone had
any clue or helpful ideas on how to debug this.
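
One possible shape for such debugging code is sketched below, in Python
rather than PHP and without touching the real Thrift API. The names
checked_fetch and ShiftedClient are hypothetical. The sketch assumes the
read can be arranged so that the reply carries the super column name
next to its columns, which may not hold for the actual get_slice call
depending on how the query is built. ShiftedClient is only a fake that
reproduces the symptom by answering each request with the reply meant
for the previous one.

```python
def checked_fetch(fetch, blog_id, log):
    """Call fetch(blog_id) and check that the reply is for the id requested.

    fetch is a hypothetical callable returning (super_column_name, columns);
    it stands in for the real client call and is not part of any library.
    """
    name, columns = fetch(blog_id)
    if str(name) != str(blog_id):
        # Record the mismatch instead of silently handing back wrong data.
        log.append("requested %s but got super column %s" % (blog_id, name))
        return None
    return columns


class ShiftedClient:
    """Fake client that returns the reply to the previous request."""

    def __init__(self, data):
        self.data = data        # maps blog id -> columns
        self.pending = None     # id of the request whose reply is still "queued"

    def fetch(self, blog_id):
        previous, self.pending = self.pending, blog_id
        # The very first call has nothing queued, so it answers correctly.
        key = previous if previous is not None else blog_id
        return key, self.data[key]
```

Wrapping every read this way would at least show which request a
shifted reply belongs to, and whether the shift starts after a
particular kind of call.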

 - Juho Mäkinen
