Since each row in my column family has 30 columns, wouldn't this translate to ~8,000 rows per second? Or am I misunderstanding something?

Talking in terms of columns, my load test would seem to perform as follows:

100,000 rows / 26 sec * 30 columns/row = 115K columns per second.
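
For anyone who wants to check the arithmetic, here is a small sketch of the two conversions. The function names are just for illustration, and applying my 30 columns/row to Aaron's 250K columns/sec figure is my assumption; his Wikipedia rows may well have had a different width.

```python
def columns_per_second(rows, seconds, columns_per_row):
    """Column write throughput for a load test of `rows` rows in `seconds`."""
    return rows / seconds * columns_per_row


def rows_per_second(columns_per_sec, columns_per_row):
    """Convert a column throughput into an equivalent row throughput."""
    return columns_per_sec / columns_per_row


# My laptop test: 100,000 rows in 26 seconds, 30 columns per row.
laptop = columns_per_second(100_000, 26, 30)

# Aaron's figure of 250K columns/sec, ASSUMING my 30 columns per row.
cluster_rows = rows_per_second(250_000, 30)

print(f"laptop: {laptop:,.0f} columns/sec")       # laptop: 115,385 columns/sec
print(f"cluster: {cluster_rows:,.0f} rows/sec")   # cluster: 8,333 rows/sec
```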

That's on a dual-core, 2.66 GHz laptop with 4GB RAM, running a single Cassandra node with the Hector (Java) client.

Am I interpreting things correctly?

- Steve


On Tue, May 3, 2011 at 3:59 PM, aaron morton <aaron@thelastpickle.com> wrote:
To give an idea, last March (2010) I ran a much older Cassandra on 10 HP blades (dual socket, 4 core, 16GB, 2.5-inch laptop HDD) and was writing around 250K columns per second, with 500 Python processes loading the data from Wikipedia running on another 10 HP blades.

This was my first out-of-the-box, no-tuning test (other than using sensible batch updates). Since then Cassandra has gotten much faster.

Hope that helps
Aaron

On 4 May 2011, at 02:22, Jonathan Ellis wrote:

> You don't give many details, but I would guess:
>
> - your benchmark is not multithreaded
> - mongodb is not configured for durable writes, so you're really only
> measuring the time for it to buffer it in memory
> - you haven't loaded enough data to hit "mongo's index doesn't fit in
> memory anymore"
>
> On Tue, May 3, 2011 at 8:24 AM, Steve Smith <stevenpsmith123@gmail.com> wrote:
>> I am working for a client that needs to persist 100K-200K records per second
>> for later querying.  As a proof of concept, we are looking at several
>> options including nosql (Cassandra and MongoDB).
>> I have been running some tests on my laptop (MacBook Pro, 4GB RAM, 2.66 GHz,
>> Dual Core/4 logical cores) and have not been happy with the results.
>> The best I have been able to accomplish is 100K records in approximately 30
>> seconds.  Each record has 30 columns, mostly made up of integers.  I have
>> tried both the Hector and Pelops APIs, and have tried writing in batches
>> versus one at a time.  The times have not varied much.
>> I am using the out of the box configuration for Cassandra, and while I know
>> using 1 disk will have an impact on performance, I would expect to see
>> better write numbers than I am seeing.
>> As a point of reference, running the same test against MongoDB I was able to
>> write 100K records in 3.5 seconds.
>> Any tips would be appreciated.
>>
>> - Steve
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com