Hello!

I'm new to Cassandra, so I may be misunderstanding some things.

In the following "benchmark" I inserted 4,000,000 records like this:

{"value": str(i), "text": "some small text"}

I use the lazyboy library (http://github.com/digg/lazyboy) to simplify working with the Cassandra Thrift interface. My Python insert program looks like this:

from lazyboy import *
from lazyboy.key import Key

import time
import random

# Define your cluster(s)
connection.add_pool('test', ['localhost:9160'])

# Insert in batches of 100,000 records
for j in xrange(0, 41):
  bt = time.time()
  begin = 100000 * j

  for i in xrange(begin, begin + 100000):
    # Print the elapsed time after every 10,000 inserts
    if (i != begin) and ((i % 10000) == 0):
      print time.time() - bt
      bt = time.time()

    rec = record.Record()
    rec.key = Key("test", "Aquarium", str(i))

    rec.update({"value": str(i), "text": "ruslan text"})
    rec.save()

  print time.time() - bt
  print "%s'th 100000 inserts done" % (j)

  # Pause between batches
  time.sleep(10)


Then I try to fetch random records from my storage:

begin = time.time()

for i in xrange(0, 100000):
  # Print the elapsed time after every 10,000 reads
  if i and (i % 10000) == 0:
    print time.time() - begin
    begin = time.time()

  rec = record.Record()
  rec.load(Key("test", "Aquarium", str(random.randint(0, 3000000))))

print time.time() - begin


Every 10,000 requests take about 8 seconds:

8.04699993134
8.07800006866
8.18799996376
8.17199993134
8.15600013733
8.09399986267
8.07800006866
8.04699993134
8.06200003624
8.06299996376


Then I ran a similar test against MySQL with the InnoDB storage engine, using the following program:

import MySQLdb as dbi

import time
import random

g_dbh  = dbi.connect(db="test", user="root", passwd="root")
cursor = g_dbh.cursor()

begin = time.time()

for i in xrange(0, 100000):
  # Print the elapsed time after every 10,000 reads
  if i and (i % 10000) == 0:
    print time.time() - begin
    begin = time.time()

  # Query parameters are passed as a tuple
  cursor.execute("select * from test where value=%s", (random.randint(0, 3000000),))
  cursor.fetchone()

print time.time() - begin


Here I get about 1.5 seconds per 10,000 requests:
1.54699993134
1.57800006852
1.18799996376
1.46671993134
1.76670013733
1.50399986267
1.57800003872
1.50699993134
1.50200003624
1.50099996313

Is this normal, or am I doing something wrong? By these numbers Cassandra is 8 / 1.5 = about 5.3 times slower than MySQL InnoDB.
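
For reference, the ratio can be recomputed from the two timing lists instead of the rounded 8 and 1.5. This is only a minimal sketch: the numbers are copied from the results above, and the names (`cassandra`, `mysql`, `mean`, `REQUESTS_PER_BATCH`) are made up for illustration. It is written so that it runs under both Python 2 and Python 3.

```python
# Batch timings copied from the results above (seconds per 10,000 requests).
cassandra = [8.04699993134, 8.07800006866, 8.18799996376, 8.17199993134,
             8.15600013733, 8.09399986267, 8.07800006866, 8.04699993134,
             8.06200003624, 8.06299996376]
mysql = [1.54699993134, 1.57800006852, 1.18799996376, 1.46671993134,
         1.76670013733, 1.50399986267, 1.57800003872, 1.50699993134,
         1.50200003624, 1.50099996313]

REQUESTS_PER_BATCH = 10000.0


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / float(len(values))


cassandra_mean = mean(cassandra)
mysql_mean = mean(mysql)

# How many times slower the Cassandra reads are, on average.
ratio = cassandra_mean / mysql_mean

# Average time per single request, in milliseconds.
cassandra_ms = cassandra_mean / REQUESTS_PER_BATCH * 1000.0
mysql_ms = mysql_mean / REQUESTS_PER_BATCH * 1000.0

print("Cassandra: %.3f ms per request" % cassandra_ms)
print("MySQL:     %.3f ms per request" % mysql_ms)
print("Slowdown:  %.2fx" % ratio)
```

Both loops are single-threaded and wait for each reply before sending the next request, so these figures are per-request round-trip times rather than the maximum throughput of either server.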


In Cassandra I turned off all debugging, and my keyspace looks like this:

  <Keyspaces>
    <Keyspace Name="test">
       <ColumnFamily CompareWith="BytesType" Name="Aquarium" />
    </Keyspace>
  </Keyspaces>


My InnoDB table looks like this:

CREATE TABLE `test` (
  `value` int(11) NOT NULL,
  `text` char(255) NOT NULL,
  PRIMARY KEY (`value`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8


For MySQL I use a TCP/IP connection to the server, not a UNIX domain socket. All tests were done on an Intel Core 2 Duo 8600 at 3 GHz, running FreeBSD 7.2.
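
Both read loops above follow the same pattern: pick a random key, fetch it, and report the elapsed time every 10,000 requests. A minimal sketch of that pattern as one shared helper is shown here, so that both stores could be measured by identical client code. The names `time_batches`, `fetch`, and `fake_fetch` are hypothetical; `fake_fetch` is an in-memory stand-in, and a real run would pass a function wrapping `rec.load(...)` or `cursor.execute(...)` instead.

```python
import random
import time


def time_batches(fetch, total=100000, batch=10000, max_key=3000000):
    """Call fetch(key) `total` times with random keys.

    Returns a list with the elapsed seconds of each full batch.
    A trailing partial batch (when total is not a multiple of batch)
    is not reported.
    """
    timings = []
    start = time.time()
    for i in range(1, total + 1):
        fetch(str(random.randint(0, max_key)))
        if i % batch == 0:
            timings.append(time.time() - start)
            start = time.time()
    return timings


# Hypothetical in-memory stand-in for a real Cassandra or MySQL lookup.
store = dict((str(i), "some small text") for i in range(1000))


def fake_fetch(key):
    return store.get(key)


# Small run against the stand-in: 5 batches of 1,000 lookups each.
timings = time_batches(fake_fetch, total=5000, batch=1000, max_key=999)
for seconds in timings:
    print("%.4f s per batch" % seconds)
```

Timing the stand-in also gives a rough idea of how much of each batch time is spent in the Python loop itself rather than in the database.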