Hello.

I have run into a bug when working with Cassandra.

This e-mail shows the steps I use to reproduce it, so that you can try them too.

SUMMARY OF THE BUG:
   (1) Insert a row with a supercolumn that contains a subcolumn.
   (2) Remove the supercolumn.
   (3) Reinsert the same row with the same supercolumn and subcolumn.
   (RESULT) The supercolumn as a whole can no longer be retrieved (a NotFoundException is thrown), although the specific subcolumn within it can still be retrieved. Removing Cassandra's data files and commit logs makes the problem go away.

PREREQUISITES:
* Use the column families that are defined by storage-conf.xml in its default "out-of-the-box" configuration. Specifically, I use the keyspace "Keyspace1" with the super column family "Super1".
* I use Cassandra 0.5.0-1 on Ubuntu Karmic 9.10.
* I use Thrift 0.2.0 to generate a PHP API for Cassandra. The bug occurs when I use this API.
* I run Cassandra on a single node, so I query against 127.0.0.1.
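
A side note on the timestamps in the scripts below: they pass time(), which has one-second resolution. Cassandra orders writes and deletes by the client-supplied timestamp, so a remove and a reinsert issued within the same second would carry equal timestamps, and it is worth ruling that out as a factor. A minimal sketch of a microsecond-resolution helper follows. The name timestampMicros is hypothetical (it is not part of the Thrift-generated API), and the sketch assumes a 64-bit PHP build so that the value fits in an integer.

```php
<?php
// Hypothetical helper (not part of the Thrift-generated API): returns the
// current time as microseconds since the Unix epoch.
// Assumes a 64-bit PHP build; on a 32-bit build the value would not fit in an int.
function timestampMicros()
{
    // microtime(true) returns the current time in seconds as a float.
    return (int) round(microtime(true) * 1000000);
}
```

It could be passed in place of time() in the insert and remove calls. All writers would have to switch together, because data written with microsecond timestamps cannot be overwritten by a later write that uses second-resolution timestamps.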

STEP-BY-STEP INSTRUCTIONS FOR TRIGGERING THE BUG:

Below are the PHP scripts that I execute, step by step, to trigger the bug.

STEP 1: EXECUTE THIS SCRIPT.

//We will first insert a row into the supercolumn family Super1.

//===============================================

$socket = new TSocket("127.0.0.1", 30003);
$transport = new TBufferedTransport($socket, 1024, 1024);
$protocol = new TBinaryProtocolAccelerated($transport);
$client = new CassandraClient ($protocol);
$transport->open();

$writeColumnPath = new cassandra_ColumnPath();

$writeColumnPath->column_family = 'Super1';
$writeColumnPath->super_column = 'info';
$writeColumnPath->column = 'phonenumber';

$client->insert (
    'Keyspace1',
    'adam',
    $writeColumnPath,
    '02012312345',
    time(),
    cassandra_ConsistencyLevel::ZERO
);

$transport->close();

//===============================================

RESULT OF STEP 1: The row that contains a single supercolumn with a single column has been inserted.



STEP 2: EXECUTE THIS SCRIPT.

//Next we will fetch the supercolumn of the row that we just inserted, just to make sure that the subcolumn is really there.

//===============================================

$socket = new TSocket("127.0.0.1", 30003);
$transport = new TBufferedTransport($socket, 1024, 1024);
$protocol = new TBinaryProtocolAccelerated($transport);
$client = new CassandraClient ($protocol);
$transport->open();

$readColumnPath = new cassandra_ColumnPath();

$readColumnPath->column_family = 'Super1';
$readColumnPath->super_column = 'info';
$readColumnPath->column = null; //NOTE: We want to fetch the entire supercolumn.

$res = $client->get (
    'Keyspace1',
    'adam',
    $readColumnPath,
    cassandra_ConsistencyLevel::ONE
);

echo $res->super_column->columns[0]->value;

$transport->close();

//===============================================

RESULT OF STEP 2: You receive the following output: 02012312345



STEP 3: EXECUTE THIS SCRIPT.

//Now we will remove the supercolumn of the row, but we will keep the row itself.

//===============================================

$socket = new TSocket("127.0.0.1", 30003);
$transport = new TBufferedTransport($socket, 1024, 1024);
$protocol = new TBinaryProtocolAccelerated($transport);
$client = new CassandraClient ($protocol);
$transport->open();

$deleteColumnPath = new cassandra_ColumnPath();

$deleteColumnPath->column_family = 'Super1';
$deleteColumnPath->super_column = 'info';
$deleteColumnPath->column = null; //NOTE: We want to remove the entire supercolumn 'info'.

$client->remove (
    'Keyspace1',
    'adam',
    $deleteColumnPath,
    time(),
    cassandra_ConsistencyLevel::ZERO
);


$transport->close();

//===============================================

RESULT OF STEP 3: The supercolumn 'info' is removed from the row.



STEP 4: EXECUTE THIS SCRIPT.

//Now let's try to fetch the supercolumn again, just to make sure it is really gone.

//===============================================

$socket = new TSocket("127.0.0.1", 30003);
$transport = new TBufferedTransport($socket, 1024, 1024);
$protocol = new TBinaryProtocolAccelerated($transport);
$client = new CassandraClient ($protocol);
$transport->open();

$readColumnPath = new cassandra_ColumnPath();

$readColumnPath->column_family = 'Super1';
$readColumnPath->super_column = 'info';
$readColumnPath->column = null; //NOTE: Fetching the entire supercolumn.

$res = $client->get (
    'Keyspace1',
    'adam',
    $readColumnPath,
    cassandra_ConsistencyLevel::ONE
);

echo $res->super_column->columns[0]->value;

$transport->close();

//===============================================

RESULT OF STEP 4: A NotFoundException is thrown.

STEP 5: EXECUTE THIS SCRIPT.

//Now we will insert the exact same row again, containing the same supercolumn and column.

//===============================================

$socket = new TSocket("127.0.0.1", 30003);
$transport = new TBufferedTransport($socket, 1024, 1024);
$protocol = new TBinaryProtocolAccelerated($transport);
$client = new CassandraClient ($protocol);
$transport->open();

$writeColumnPath = new cassandra_ColumnPath();

$writeColumnPath->column_family = 'Super1';
$writeColumnPath->super_column = 'info';
$writeColumnPath->column = 'phonenumber';

$client->insert (
    'Keyspace1',
    'adam',
    $writeColumnPath,
    '02012312345',
    time(),
    cassandra_ConsistencyLevel::ZERO
);

$transport->close();

//===============================================

RESULT OF STEP 5: The row that contains a single supercolumn with a single column has been inserted.

STEP 6: EXECUTE THIS SCRIPT (THE BUG WILL APPEAR HERE).

//Now we will try to fetch the supercolumn within the row again. This is where the bug appears.

//===============================================

$socket = new TSocket("127.0.0.1", 30003);
$transport = new TBufferedTransport($socket, 1024, 1024);
$protocol = new TBinaryProtocolAccelerated($transport);
$client = new CassandraClient ($protocol);
$transport->open();

$readColumnPath = new cassandra_ColumnPath();

$readColumnPath->column_family = 'Super1';
$readColumnPath->super_column = 'info';
$readColumnPath->column = null; //NOTE: We are fetching the entire supercolumn 'info'

$res = $client->get (
    'Keyspace1',
    'adam',
    $readColumnPath,
    cassandra_ConsistencyLevel::ONE
);

echo $res->super_column->columns[0]->value;

$transport->close();

//===============================================

RESULT OF STEP 6: A NotFoundException is still thrown, even though the row has been inserted again.

STEP 7: EXECUTE THIS SCRIPT.

//Now let's get the same column again, but this time we fetch only the column itself rather than its entire supercolumn. The difference from the previous step is marked in the code.

//===============================================

$socket = new TSocket("127.0.0.1", 30003);
$transport = new TBufferedTransport($socket, 1024, 1024);
$protocol = new TBinaryProtocolAccelerated($transport);
$client = new CassandraClient ($protocol);
$transport->open();

$readColumnPath = new cassandra_ColumnPath();

$readColumnPath->column_family = 'Super1';
$readColumnPath->super_column = 'info';
$readColumnPath->column = 'phonenumber'; //NOTE: This time we will fetch the specific column.

$res = $client->get (
    'Keyspace1',
    'adam',
    $readColumnPath,
    cassandra_ConsistencyLevel::ONE
);

echo $res->column->value;

$transport->close();

//===============================================

RESULT OF STEP 7: You receive the following output: 02012312345.

STEP 8: SHUT DOWN CASSANDRA & KILL JAVA & REMOVE CASSANDRA'S DATA FILES AND COMMIT LOGS.

STEP 9: RESTART CASSANDRA.

STEP 10: Repeat STEP 1 and STEP 2 to see that the bug has disappeared and the column value is fetched correctly.

CONCLUSION: I have tried this with various consistency levels, and the same thing happens each time. Next I will try inserting and removing with other methods, if the Thrift API allows it.
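
For anyone who wants to rerun the sequence quickly, steps 1 to 7 can be condensed into one script. The following is only a sketch under a few assumptions: the Thrift-generated classes used in the scripts above are already loaded, the generated exception class is named cassandra_NotFoundException, writes use ConsistencyLevel::ONE instead of ZERO so that each one is acknowledged before the next call, and the timestamps are incremented by hand to keep the three operations ordered. The function names infoPath, canRead and reproduce are made up for this example.

```php
<?php
// Sketch only: assumes the Thrift-generated Cassandra classes from the
// scripts above are already loaded, and that the generated exception class
// is named cassandra_NotFoundException.

// Build a column path into Super1/info. Pass null to address the whole supercolumn.
function infoPath($column)
{
    $path = new cassandra_ColumnPath();
    $path->column_family = 'Super1';
    $path->super_column = 'info';
    $path->column = $column;
    return $path;
}

// Return true if get() succeeds, false if it throws NotFoundException.
function canRead($client, $column)
{
    try {
        $client->get('Keyspace1', 'adam', infoPath($column), cassandra_ConsistencyLevel::ONE);
        return true;
    } catch (cassandra_NotFoundException $e) {
        return false;
    }
}

// Run insert -> remove -> reinsert, then report which reads succeed.
function reproduce($client)
{
    // Assumption: explicit, strictly increasing timestamps so that the three
    // operations stay ordered even when they run within the same second.
    $ts = time();
    $client->insert('Keyspace1', 'adam', infoPath('phonenumber'), '02012312345',
        $ts, cassandra_ConsistencyLevel::ONE);
    $client->remove('Keyspace1', 'adam', infoPath(null),
        $ts + 1, cassandra_ConsistencyLevel::ONE);
    $client->insert('Keyspace1', 'adam', infoPath('phonenumber'), '02012312345',
        $ts + 2, cassandra_ConsistencyLevel::ONE);

    return array(
        'supercolumn' => canRead($client, null),          // whole supercolumn, as in step 6
        'subcolumn'   => canRead($client, 'phonenumber'), // single column, as in step 7
    );
}

// Usage, with $client created and $transport opened as in the scripts above:
//   print_r(reproduce($client));
```

Going by steps 6 and 7, an affected setup should report 'supercolumn' as false and 'subcolumn' as true, while a healthy one should report true for both.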

I have included some of Cassandra's conf files so you can see how I've configured my setup. Perhaps I am doing something wrong there?