incubator-cassandra-user mailing list archives

From ruslan usifov <ruslan.usi...@gmail.com>
Subject Re: Why cassandra single node so slow?
Date Sat, 14 Nov 2009 12:28:33 GMT
No. But I think all the data will be in memory, and gstat (a FreeBSD utility)
shows no disk activity at all.

2009/11/14 TuxRacer69 <tuxracer69@gmail.com>

> Hi Ruslan,
>
> did you store the logs and the data on 2 different disks as described at:
> http://wiki.apache.org/cassandra/StorageConfiguration
> and
> http://wiki.apache.org/cassandra/FAQ#what_kind_of_hardware_should_i_use
> ?
>
> Cheers
> TuxRacer
>
>
> ruslan usifov wrote:
>
>> Hello!
>>
>> I'm new to Cassandra, so I may misunderstand some things.
>>
>> In the following "benchmark" I inserted 4000000 records like this:
>>
>> {"value": str(i), "text": "some small text"}
>>
>> I use the lazyboy library (http://github.com/digg/lazyboy) to simplify
>> working with the Cassandra Thrift interface. My Python insert program
>> looks like this:
>>
>> from lazyboy import *
>> from lazyboy.key import Key;
>>
>> import time;
>> import random;
>>
>> # Define your cluster(s)
>> connection.add_pool('test', ['localhost:9160'])
>>
>> for j in xrange(0, 41):
>>  bt = time.time();
>>  begin = 100000 * j;
>>
>>  for i in xrange(begin, begin + 100000):
>>    if (i != begin) and ((i % 10000) == 0):
>>      print time.time() - bt;
>>      bt = time.time()
>>
>>    rec = record.Record();
>>    rec.key = Key("test", "Aquarium", str(i));
>>
>>    rec.update({"value": str(i), "text": "ruslan text"})
>>    rec.save();
>>
>>  print time.time() - bt;
>>  print "%s'th 100000 inserts done" % (j);
>>
>>  time.sleep(10);
>>
>>
>> Then I fetch random records from my storage:
>>
>> begin = time.time();
>>
>> for i in xrange(0, 100000):
>>  if i and (i % 10000) == 0:
>>    print time.time() - begin;
>>    begin = time.time()
>>
>>  rec = record.Record();
>>  rec.load(Key("test", "Aquarium", str(random.randint(0, 3000000))));
>>
>> print time.time() - begin;
>>
>>
>> Every 10000 requests take about 8 seconds:
>>
>> 8.04699993134
>> 8.07800006866
>> 8.18799996376
>> 8.17199993134
>> 8.15600013733
>> 8.09399986267
>> 8.07800006866
>> 8.04699993134
>> 8.06200003624
>> 8.06299996376
>>
>>
>> Then I ran a similar test against MySQL with the InnoDB storage engine,
>> using the following program:
>>
>> import MySQLdb as dbi;
>> from MySQLdb.cursors import *;
>>
>> import time;
>> import random;
>> import sys;
>>
>> g_dbh  = dbi.connect(db="test", user="root", passwd="root");
>> cursor = g_dbh.cursor();
>>
>> begin = time.time();
>>
>> for i in xrange(0, 100000):
>>  if i and (i % 10000) == 0:
>>    print time.time() - begin;
>>    begin = time.time()
>>
>>  # Pass the query parameter as a one-element tuple.
>>  cursor.execute("select * from test where value=%s",
>>                 (random.randint(0, 3000000),));
>>  cursor.fetchone();
>>
>> print time.time() - begin;
>>
>>
>> And I get about 1.5 seconds per 10000 requests:
>> 1.54699993134
>> 1.57800006852
>> 1.18799996376
>> 1.46671993134
>> 1.76670013733
>> 1.50399986267
>> 1.57800003872
>> 1.50699993134
>> 1.50200003624
>> 1.50099996313
>>
>> Is this normal, or am I doing something wrong? By these numbers Cassandra
>> is 8/1.5 = 5.3 times slower than MySQL InnoDB.
>>
>>
>> In Cassandra I turned off all debugging, and my keyspace looks like this:
>>
>>  <Keyspaces>
>>    <Keyspace Name="test">
>>       <ColumnFamily CompareWith="BytesType" Name="Aquarium" />
>>    </Keyspace>
>>  </Keyspaces>
>>
>>
>> My InnoDB table looks like this:
>>
>> CREATE TABLE `test` (
>>  `value` int(11) NOT NULL,
>>  `text` char(255) NOT NULL,
>>  PRIMARY KEY (`value`)
>> ) ENGINE=InnoDB DEFAULT CHARSET=utf8
>>
>>
>> For MySQL I use a TCP/IP connection to the server, not UNIX domain sockets.
>> All tests were done on an Intel Core 2 Duo 8600 at 3 GHz, on FreeBSD 7.2.
>>
>>
>
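
The batch timings quoted above can be turned into per-request figures, which makes the comparison easier to read. The following is a minimal sketch, not part of the original message: the helper name `summarize` is made up for illustration, and the only inputs are the numbers printed in the quoted output. Since both test programs issue one request at a time from a single client, the result is round-trip latency per request (roughly 0.8 ms for Cassandra versus 0.15 ms for MySQL), not the maximum throughput of either server.

```python
# Seconds per batch of 10000 reads, copied from the quoted message.
cassandra = [8.04699993134, 8.07800006866, 8.18799996376, 8.17199993134,
             8.15600013733, 8.09399986267, 8.07800006866, 8.04699993134,
             8.06200003624, 8.06299996376]
mysql = [1.54699993134, 1.57800006852, 1.18799996376, 1.46671993134,
         1.76670013733, 1.50399986267, 1.57800003872, 1.50699993134,
         1.50200003624, 1.50099996313]

BATCH = 10000  # requests per timing sample, as in the test loops


def summarize(samples, batch=BATCH):
    """Hypothetical helper: mean latency and rate for sequential requests."""
    mean_batch = sum(samples) / len(samples)
    return {
        "ms_per_request": mean_batch / batch * 1000.0,
        "requests_per_second": batch / mean_batch,
    }


c = summarize(cassandra)
m = summarize(mysql)
ratio = c["ms_per_request"] / m["ms_per_request"]

print("Cassandra: %.3f ms/request, %.0f requests/s"
      % (c["ms_per_request"], c["requests_per_second"]))
print("MySQL:     %.3f ms/request, %.0f requests/s"
      % (m["ms_per_request"], m["requests_per_second"]))
print("Cassandra/MySQL latency ratio: %.2f" % ratio)
```

The single-argument `print(...)` form is used so the sketch runs under both Python 2, which the quoted programs target, and Python 3.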
