incubator-cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Very new user needs some troubleshooting pointers
Date Fri, 09 Apr 2010 16:21:52 GMT
If you're only seeing 1-2 RPS then you should turn on debug logging to
see where the latency is.

On Fri, Apr 9, 2010 at 11:14 AM, Mark Jones <MJones@imagehawk.com> wrote:
> Sounds like we are experiencing some of the same problems. (I’m using 0.6RC1.) I
> have a 3-node cluster with 8GB/machine (dual-core CPU).  I’m peaking on
> inserts at about 6000-7000/second running 40 threads, with separate spindles
> for commitlog and data.
>
>
>
> My read speed is atrocious: 800/sec sustained (it starts off at 1800+/second
> and falls back to 800/sec).  Of course, that is only if I read from the
> “correct” node.  Depending on the moment, 2 of the nodes will return
> 1-2/second instead of 800, and only one node will return 800/second.  And if
> I spread the reads across many nodes, overall performance drops.  nodetool
> loadbalance can change which node is the “golden” node, but I don’t know
> why.  I have doubled the number of concurrent read threads and seen some
> performance improvement (that was the last thing I tried, and it eked out
> another 150/second).
>
>
>
> So much about Cassandra makes me WANT it to work. I mean, look at the fact
> that all nodes are essentially equal, and that it replicates from rack to rack
> and from DC to DC. Now, if I could just make it perform.
>
>
>
> My machines are basically idle (a large amount of IOWait, but the time is
> spent in the pending queue rather than in device svctime).  So far I’ve got
> little insight into what could be wrong. I’ve increased the key cache 10X
> using JConsole, but the hit rate is still abysmal at times.
>
>
>
> I’m writing 400-800 byte blobs with an 8-byte key (supercolumn), a 12-byte
> “subkey”, and a 5-byte column name, something that would seem to be
> right up Cassandra’s alley.
>
>
>
> Right now I’m reworking my test to dump it into MySQL on the same machines,
> so I can compare the two for speed, because either I’ve got crap for
> hardware, or there is something rotten in Denmark.
>
>
>
> From: Heath Oderman [mailto:heath@526valley.com]
> Sent: Friday, April 09, 2010 10:40 AM
> To: user@cassandra.apache.org
> Subject: Re: Very new user needs some troubleshooting pointers
>
>
>
> Thanks for the reply Jonathan!
>
>
>
> I started with multithreaded tests, but when my performance was so much
> slower than my buddy's I switched to one thread to try to isolate and identify
> the differences.  I got tunnel vision and kept on with the one-thread tests.
>
>
>
> I'll modify the tests and try again.
>
>
>
> Thanks,
>
> Stu
>
>
>
> On Fri, Apr 9, 2010 at 11:34 AM, Jonathan Ellis <jbellis@gmail.com> wrote:
>
> A single-threaded test is meaningless.  You need a multithreaded (or
> multiprocess) benchmark like the one in contrib/py_stress.
>
> Picture worth 1000 words:
> http://spyced.blogspot.com/2010/01/cassandra-05.html
>
> On Thu, Apr 8, 2010 at 3:59 PM, Heath Oderman <heath@526valley.com> wrote:
>> Hi All,
>> I'm brand new to Cassandra and know absolutely nothing, so please forgive
>> me in advance.
>> A friend and I have each set up a few standalone Cassandra nodes,
>> completely default.
>> His: Mac OSX Snow Leopard
>>      Mac Book Pro
>>      Intel Duo Core
>>      4GB Ram
>>      5400 rpm disk
>> Mine: debian 5.x (lenny) with the deb package from
>> http://www.apache.org/dist/cassandra/debian
>>      2 desktops:
>>        Intel duo core
>>        4GB RAM
>>        7200 rpm SATA drives
>>      1 blade:
>>        8GB RAM
>>        10000 rpm disk
>>        dual Xeon
>>      (I have a Windows box too, like the 2 desktops)
>>
>>     (each of those machines is stand alone)
>>
>> My debian boxes are brand new installs with nothing else running: purely
>> console environments, with only SSH & Cassandra installed.
>> The Cassandra configs are the *default configs* with only 'ListenAddress'
>> and 'ThriftAddress' changed to the ext ip for those boxes.
>> We generated a C# library with Thrift to connect to these servers.  We
>> wrote a simple C# app that loops 10,000 times and does a
>>          _client.batch_insert(_keyspace, map.Key.GetValue(o, null).ToString(), dict, ConsistencyLevel.ONE);
>> "batch_insert" I guess is the key bit up there.
>> The reason that I'm writing is that the batch_insert call takes 400,000
>> ticks every time it is called when running against the debian boxes.  Any
>> of them.
>> The result is that 10,000 inserts against his machine take about 30
>> seconds, and they take about 1 min 45 seconds against any of my servers
>> (longer against the Windows 7 server).
>> The MacBook Pro is faster, while I would expect it to be slower.  (The
>> MacBook Pro is his laptop and he's running mail and all kinds of other stuff
>> simultaneously.)
>> I'm on a gigabit network, and iostat / top / bmon all show that the
>> Cassandra server isn't working very hard.
>> Performance Monitor on my Windows client shows that the computer running
>> the loop is hardly working.
>> I am writing to you to ask where I might go to get information on
>> comparing the environments, improving my performance, etc.  I've been
>> googling all day and haven't been able to figure anything out.
>> If this is the wrong forum, sorry!
>> Thanks for any help/suggestions you might have.
>> Stu
>>
>>
>>
>>
>
>
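
A rough illustration of the advice quoted above (benchmark with many threads, not a single loop). This is only a sketch: it is not contrib/py_stress and it does not talk to Cassandra at all. The names `run_benchmark` and `fake_insert` are made up for the example, and the 2 ms sleep is an assumed round-trip latency standing in for a blocking client call such as batch_insert.

```python
import threading
import time


def run_benchmark(insert_fn, total_ops, num_threads):
    """Call insert_fn total_ops times, split across num_threads threads.

    Returns (ops_completed, ops_per_second).
    """
    counts = [0] * num_threads  # one slot per thread, so no locking is needed

    def worker(idx, n):
        for i in range(n):
            insert_fn(idx, i)
            counts[idx] += 1

    # Split the work as evenly as possible across the threads.
    base, extra = divmod(total_ops, num_threads)
    threads = [
        threading.Thread(target=worker, args=(t, base + (1 if t < extra else 0)))
        for t in range(num_threads)
    ]

    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start

    done = sum(counts)
    return done, done / elapsed


def fake_insert(thread_idx, i):
    # Hypothetical stand-in for a real client call: it only sleeps to
    # imitate a fixed network round trip (assumed to be 2 ms here).
    time.sleep(0.002)


if __name__ == "__main__":
    for n in (1, 4, 16):
        done, rate = run_benchmark(fake_insert, 400, n)
        print(f"{n:2d} threads: {done} ops, {rate:.0f} ops/sec")
```

With a blocking client, one loop that issues one call at a time is bounded by round-trip latency, however idle the server is: 10,000 inserts in 1 min 45 s works out to about 10.5 ms per call, or roughly 95 inserts/second. Adding client threads is what lets more requests be in flight at once, which is why the single-threaded numbers say little about what the server can sustain.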
