cassandra-user mailing list archives

From Arash Bazrafshan <ara...@gmail.com>
Subject Bug in Cassandra that occurs when removing a supercolumn.
Date Sat, 03 Apr 2010 12:27:39 GMT
Hello.

I have run into a bug while working with Cassandra.

In this e-mail I show what I do to reproduce it, so that you can try it out
too.

SUMMARY OF THE BUG:
   (1) Insert a row with a supercolumn that contains a subcolumn.
   (2) Remove the supercolumn.
   (3) Reinsert the same row with the same supercolumn and subcolumn.
   (RESULT) You won't be able to retrieve the entire supercolumn. However,
you will still be able to retrieve the specific subcolumn within the
supercolumn. Removing Cassandra's data files and commit logs makes the
problem go away.

PREREQUISITES:
* Use the column families that are defined by storage-conf.xml in its
default "out-of-the-box" configuration. Specifically, I use the keyspace
"Keyspace1" with the super column family "Super1".
* I use Cassandra 0.5.0-1 on Ubuntu Karmic 9.10.
* I use Thrift 0.2.0 to generate a PHP API for Cassandra. It is when I use
this API that the bug occurs.
* I run Cassandra on a single node, so I query against 127.0.0.1.

STEP-BY-STEP INSTRUCTIONS FOR TRIGGERING THE BUG:

Below are the PHP scripts that I execute, step by step, to trigger the bug.

STEP 1: EXECUTE THIS SCRIPT.

//We will first insert a row into the supercolumn family Super1.

//===============================================

$socket = new TSocket("127.0.0.1", 30003);
$transport = new TBufferedTransport($socket, 1024, 1024);
$protocol = new TBinaryProtocolAccelerated($transport);
$client = new CassandraClient ($protocol);
$transport->open();

$writeColumnPath = new cassandra_ColumnPath();

$writeColumnPath->column_family = 'Super1';
$writeColumnPath->super_column = 'info';
$writeColumnPath->column = 'phonenumber';

$client->insert (
    'Keyspace1',
    'adam',
    $writeColumnPath,
    '02012312345',
    time(),
    cassandra_ConsistencyLevel::ZERO
);

$transport->close();

//===============================================

RESULT OF STEP 1: The row that contains a single supercolumn with a single
column has been inserted.



STEP 2: EXECUTE THIS SCRIPT.

//Next we will fetch the supercolumn of the row that we just inserted, just
//to make sure that the subcolumn is really there.

//===============================================

$socket = new TSocket("127.0.0.1", 30003);
$transport = new TBufferedTransport($socket, 1024, 1024);
$protocol = new TBinaryProtocolAccelerated($transport);
$client = new CassandraClient ($protocol);
$transport->open();

$readColumnPath = new cassandra_ColumnPath();

$readColumnPath->column_family = 'Super1';
$readColumnPath->super_column = 'info';
$readColumnPath->column = null; //NOTE: We want to fetch the entire supercolumn.

$res = $client->get (
    'Keyspace1',
    'adam',
    $readColumnPath,
    cassandra_ConsistencyLevel::ONE
);

echo $res->super_column->columns[0]->value;

$transport->close();

//===============================================

RESULT OF STEP 2: You receive the following output: 02012312345



STEP 3: EXECUTE THIS SCRIPT.

//Now we will remove the supercolumn of the row, but we will keep the row
//itself.

//===============================================

$socket = new TSocket("127.0.0.1", 30003);
$transport = new TBufferedTransport($socket, 1024, 1024);
$protocol = new TBinaryProtocolAccelerated($transport);
$client = new CassandraClient ($protocol);
$transport->open();

$deleteColumnPath = new cassandra_ColumnPath();

$deleteColumnPath->column_family = 'Super1';
$deleteColumnPath->super_column = 'info';
$deleteColumnPath->column = null; //NOTE: We want to remove the entire supercolumn 'info'.

$client->remove (
    'Keyspace1',
    'adam',
    $deleteColumnPath,
    time(),
    cassandra_ConsistencyLevel::ZERO
);


$transport->close();

//===============================================

RESULT OF STEP 3: The supercolumn 'info' is removed from the row.



STEP 4: EXECUTE THIS SCRIPT.

//Now let's try to fetch the column within the supercolumn again, just to
//make sure it is really gone.

//===============================================

$socket = new TSocket("127.0.0.1", 30003);
$transport = new TBufferedTransport($socket, 1024, 1024);
$protocol = new TBinaryProtocolAccelerated($transport);
$client = new CassandraClient ($protocol);
$transport->open();

$readColumnPath = new cassandra_ColumnPath();

$readColumnPath->column_family = 'Super1';
$readColumnPath->super_column = 'info';
$readColumnPath->column = null; //NOTE: Fetching the entire supercolumn.

$res = $client->get (
    'Keyspace1',
    'adam',
    $readColumnPath,
    cassandra_ConsistencyLevel::ONE
);

echo $res->super_column->columns[0]->value;

$transport->close();

//===============================================

RESULT OF STEP 4: A NotFoundException is thrown.

STEP 5: EXECUTE THIS SCRIPT.

//Now we will insert the exact same row again, containing the same
//supercolumn and column.

//===============================================

$socket = new TSocket("127.0.0.1", 30003);
$transport = new TBufferedTransport($socket, 1024, 1024);
$protocol = new TBinaryProtocolAccelerated($transport);
$client = new CassandraClient ($protocol);
$transport->open();

$writeColumnPath = new cassandra_ColumnPath();

$writeColumnPath->column_family = 'Super1';
$writeColumnPath->super_column = 'info';
$writeColumnPath->column = 'phonenumber';

$client->insert (
    'Keyspace1',
    'adam',
    $writeColumnPath,
    '02012312345',
    time(),
    cassandra_ConsistencyLevel::ZERO
);

$transport->close();

//===============================================

RESULT OF STEP 5: The row that contains a single supercolumn with a single
column has been inserted.

STEP 6: EXECUTE THIS SCRIPT (THE BUG WILL APPEAR HERE).

//Now we will try to fetch the supercolumn within the row again. This is
//where the bug appears.

//===============================================

$socket = new TSocket("127.0.0.1", 30003);
$transport = new TBufferedTransport($socket, 1024, 1024);
$protocol = new TBinaryProtocolAccelerated($transport);
$client = new CassandraClient ($protocol);
$transport->open();

$readColumnPath = new cassandra_ColumnPath();

$readColumnPath->column_family = 'Super1';
$readColumnPath->super_column = 'info';
$readColumnPath->column = null; //NOTE: We are fetching the entire supercolumn 'info'.

$res = $client->get (
    'Keyspace1',
    'adam',
    $readColumnPath,
    cassandra_ConsistencyLevel::ONE
);

echo $res->super_column->columns[0]->value;

$transport->close();

//===============================================

RESULT OF STEP 6: A NotFoundException is still thrown, even though the row
has been inserted again.

STEP 7: EXECUTE THIS SCRIPT.

//Now let's get the same column again, but this time we won't fetch its
//entire supercolumn, only the column itself. The difference between this
//step and the previous one is marked in the code.

//===============================================

$socket = new TSocket("127.0.0.1", 30003);
$transport = new TBufferedTransport($socket, 1024, 1024);
$protocol = new TBinaryProtocolAccelerated($transport);
$client = new CassandraClient ($protocol);
$transport->open();

$readColumnPath = new cassandra_ColumnPath();

$readColumnPath->column_family = 'Super1';
$readColumnPath->super_column = 'info';
$readColumnPath->column = 'phonenumber'; //NOTE: This time we fetch the specific column.

$res = $client->get (
    'Keyspace1',
    'adam',
    $readColumnPath,
    cassandra_ConsistencyLevel::ONE
);

echo $res->column->value;

$transport->close();

//===============================================

RESULT OF STEP 7: You receive the following output: 02012312345.
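
For convenience, steps 1 through 7 can also be run back to back as a single
script. This is only a sketch, under a few assumptions: the Thrift runtime and
the generated Cassandra classes are already loaded, as in the scripts above
(the include paths depend on the installation and are not shown); the helper
names infoPath() and tryGet() are hypothetical; writes use ConsistencyLevel ONE
instead of ZERO so that each read does not race the preceding write; and
sleep(1) stands in for the pause between separately executed scripts, because
time() only has one-second resolution.

```php
<?php
// Assumption: TSocket, TBufferedTransport, TBinaryProtocolAccelerated,
// CassandraClient and the cassandra_* classes are already loaded.

// Hypothetical helper: builds a ColumnPath into Super1/info.
// Passing null addresses the entire supercolumn.
function infoPath($column)
{
    $path = new cassandra_ColumnPath();
    $path->column_family = 'Super1';
    $path->super_column = 'info';
    $path->column = $column;
    return $path;
}

// Hypothetical helper: runs get() and returns either the fetched value or
// the class name of whatever exception was thrown.
function tryGet($client, $column)
{
    try {
        $res = $client->get('Keyspace1', 'adam', infoPath($column),
                            cassandra_ConsistencyLevel::ONE);
        return ($column === null)
            ? $res->super_column->columns[0]->value
            : $res->column->value;
    } catch (Exception $e) {
        return 'exception: ' . get_class($e);
    }
}

$socket = new TSocket("127.0.0.1", 30003);
$transport = new TBufferedTransport($socket, 1024, 1024);
$protocol = new TBinaryProtocolAccelerated($transport);
$client = new CassandraClient($protocol);
$transport->open();

// Steps 1 and 2: insert the subcolumn, then read the whole supercolumn.
$client->insert('Keyspace1', 'adam', infoPath('phonenumber'), '02012312345',
                time(), cassandra_ConsistencyLevel::ONE);
echo "step 2: ", tryGet($client, null), "\n";

// Steps 3 and 4: remove the supercolumn with a later timestamp, read again.
sleep(1);
$client->remove('Keyspace1', 'adam', infoPath(null),
                time(), cassandra_ConsistencyLevel::ONE);
echo "step 4: ", tryGet($client, null), "\n";

// Steps 5 to 7: reinsert with a later timestamp, then read the supercolumn
// (step 6) and the specific subcolumn (step 7).
sleep(1);
$client->insert('Keyspace1', 'adam', infoPath('phonenumber'), '02012312345',
                time(), cassandra_ConsistencyLevel::ONE);
echo "step 6: ", tryGet($client, null), "\n";
echo "step 7: ", tryGet($client, 'phonenumber'), "\n";

$transport->close();
```

If the behaviour matches what I describe above, the "step 6" line should report
an exception while the "step 7" line prints the phone number.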

STEP 8: SHUT DOWN CASSANDRA & KILL JAVA & REMOVE CASSANDRA'S DATA FILES AND
COMMIT LOGS.

STEP 9: RESTART CASSANDRA.

STEP 10: Repeat STEP 1 and STEP 2 to see that the bug has disappeared and
the column value is fetched correctly.

CONCLUSION: I have tried this one out with various consistency levels. The
same thing happens. Next I'll try to insert and remove using other methods
if the Thrift API allows for it.

I have included some of Cassandra's conf files so you can see how I've
configured my setup. Perhaps I am doing something wrong there?
