cassandra-user mailing list archives

From Keith Freeman <>
Subject Re: insert performance (1.2.8)
Date Wed, 21 Aug 2013 18:40:57 GMT
Building the giant batch string wasn't as bad as I thought, and at first 
I had great(!) results (using "unlogged" batches): 2500 rows/sec 
(batches of 100 in 48 threads) ran very smoothly, and the load on the 
cassandra server nodes averaged about 1.0 or less continuously.

But then I upped it to 5000 rows/sec, and the load on all 3 server nodes 
jumped to a continuous 8-10, with peaks over 14.  I also 
tried running 2 separate clients at 2500 rows/sec with the same 
results.  I don't see any compactions while at this load, so would this 
likely be the result of GC thrashing?

Seems like I'm spending a lot of effort and am still not getting very 
close to being able to insert 10k rows (10M of data each) per second, 
which is pretty disappointing.
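
For readers following along, here is a minimal sketch of what building such a batch string might look like. It assumes bind markers (`?`) so the whole batch could be prepared once; the message does not say whether markers or literal values were used. The class name `BatchCql`, the table name, and the column names in the test are hypothetical and not taken from the thread.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper: builds the CQL text of an unlogged batch of
// identical INSERT statements, with one bind marker per column.
final class BatchCql {
    private BatchCql() {}

    // One INSERT with a "?" for every column, e.g.
    // INSERT INTO t (a, b) VALUES (?, ?)
    static String insertWithMarkers(String table, List<String> columns) {
        String cols = String.join(", ", columns);
        String marks = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "INSERT INTO " + table + " (" + cols + ") VALUES (" + marks + ")";
    }

    // Wraps `rows` copies of that INSERT in BEGIN UNLOGGED BATCH ... APPLY BATCH.
    static String unloggedBatch(String table, List<String> columns, int rows) {
        if (rows < 1 || columns.isEmpty()) {
            throw new IllegalArgumentException("need at least one row and one column");
        }
        String insert = insertWithMarkers(table, columns);
        StringBuilder sb = new StringBuilder("BEGIN UNLOGGED BATCH\n");
        for (int i = 0; i < rows; i++) {
            sb.append("  ").append(insert).append(";\n");
        }
        return sb.append("APPLY BATCH").toString();
    }

    // Number of values that must be bound, in row-major order.
    static int markerCount(List<String> columns, int rows) {
        return columns.size() * rows;
    }
}
```

With 40+ columns and 100 rows per batch, a string like this carries over 4000 bind markers, so the values have to be supplied in the same row-by-row, column-by-column order.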

On 08/20/2013 07:16 PM, Nate McCall wrote:
> Thrift will allow for larger, more free-form batch construction. The 
> increase comes from doing a lot more in the same payload message. 
> Otherwise CQL is more efficient.
> If you do build those giant strings, yes you should see a performance 
> improvement.
> On Tue, Aug 20, 2013 at 8:03 PM, Keith Freeman < 
> <>> wrote:
>     Thanks.  Can you tell me why using thrift would improve
>     performance?
>     Also, if I do try to build those giant strings for a prepared
>     batch statement, should I expect another performance improvement?
>     On 08/20/2013 05:06 PM, Nate McCall wrote:
>>     Ugh - sorry, I knew Sylvain and Michaël had worked on this
>>     recently but it is only in 2.0 - I could have sworn it got marked
>>     for inclusion back into 1.2 but I was wrong:
>>     This is indeed an issue if you don't know the column count
>>     beforehand (or have a very large number of them like in your case).
>>     Again, apologies, I would not have recommended that route if I
>>     knew it was only in 2.0.
>>     I would be willing to bet you could hit those insert numbers
>>     pretty easily with thrift given the shape of your mutation.
>>     On Tue, Aug 20, 2013 at 5:00 PM, Keith Freeman <
>>     <>> wrote:
>>         So I tried inserting prepared statements separately (no
>>         batch), and my server nodes load definitely dropped
>>         significantly. Throughput from my client improved a bit, but
>>         only a few %.  I was able to *almost* get 5000 rows/sec (sort
>>         of) by also reducing the rows/insert-thread to 20-50 and
>>         eliminating all overhead from the timing, i.e. timing only
>>         the tight for loop of inserts.  But that's still a lot slower
>>         than I expected.
>>         I couldn't do batches because the driver doesn't allow
>>         prepared statements in a batch (QueryBuilder API).  It
>>         appears the batch itself could possibly be a prepared
>>         statement, but since I have 40+ columns on each insert that
>>         would take some ugly code to build so I haven't tried it yet.
>>         I'm using CL "ONE" on the inserts and RF 2 in my schema.
>>         On 08/20/2013 08:04 AM, Nate McCall wrote:
>>>         John makes a good point re:prepared statements (I'd increase
>>>         batch sizes again once you did this as well - separate,
>>>         incremental runs of course so you can gauge the effect of
>>>         each). That should take out some of the processing overhead
>>>         of statement validation in the server (some - that load
>>>         spike still seems high though).
>>>         I'd actually be really interested as to what your results
>>>         were after doing so - I've not tried any A/B testing here
>>>         for prepared statements on inserts.
>>>         Given your load is on the server, I'm not sure adding more
>>>         async indirection on the client would buy you too much though.
>>>         Also, at what RF and consistency level are you writing?
>>>         On Tue, Aug 20, 2013 at 8:56 AM, Keith Freeman
>>>         < <>> wrote:
>>>             Ok, I'll try prepared statements.   But while sending my
>>>             statements async might speed up my client, it wouldn't
>>>             improve throughput on the cassandra nodes would it? 
>>>             They're running at pretty high loads and only about 10%
>>>             idle, so my concern is that they can't handle the data
>>>             any faster, so something's wrong on the server side.  I
>>>             don't really think there's anything on the client side
>>>             that matters for this problem.
>>>             Of course I know there are obvious h/w things I can do
>>>             to improve server performance: SSDs, more RAM, more
>>>             cores, etc.  But I thought the servers I have would be
>>>             able to handle more rows/sec than say Mysql, since write
>>>             speed is supposed to be one of Cassandra's strengths.
>>>             On 08/19/2013 09:03 PM, John Sanda wrote:
>>>>             I'd suggest using prepared statements that you
>>>>             initialize at application start up and switching to use
>>>>             Session.executeAsync coupled with Google Guava Futures
>>>>             API to get better throughput on the client side.
>>>>             On Mon, Aug 19, 2013 at 10:14 PM, Keith Freeman
>>>>             < <>> wrote:
>>>>                 Sure, I've tried different numbers for batches and
>>>>                 threads, but generally I'm running 10-30 threads at
>>>>                 a time on the client, each sending a batch of 100
>>>>                 insert statements in every call, using the
>>>>                 QueryBuilder.batch() API from the latest datastax
>>>>                 java driver, then calling the Session.execute()
>>>>                 function (synchronous) on the Batch.
>>>>                 I can't post my code, but my client does this on
>>>>                 each iteration:
>>>>                 -- divides up the set of inserts by the number of
>>>>                 threads
>>>>                 -- stores the current time
>>>>                 -- tells all the threads to send their inserts
>>>>                 -- then when they've all returned checks the
>>>>                 elapsed time
>>>>                 At about 2000 rows for each iteration, 20 threads
>>>>                 with 100 inserts each finish in about 1 second. 
>>>>                 For 4000 rows, 40 threads with 100 inserts each
>>>>                 finish in about 1.5 - 2 seconds, and as I said all
>>>>                 3 cassandra nodes have a heavy CPU load while the
>>>>                 client is hardly loaded.  I've tried with 10
>>>>                 threads and more inserts per batch, or up to 60
>>>>                 threads with fewer, doesn't seem to make a lot of
>>>>                 difference.
>>>>                 On 08/19/2013 05:00 PM, Nate McCall wrote:
>>>>>                 How big are the batch sizes? In other words, how
>>>>>                 many rows are you sending per insert operation?
>>>>>                 Other than the above, not much else to suggest
>>>>>                 without seeing some example code (on pastebin,
>>>>>                 gist or similar, ideally).
>>>>>                 On Mon, Aug 19, 2013 at 5:49 PM, Keith Freeman
>>>>>                 < <>>
>>>>>                     I've got a 3-node cassandra cluster
>>>>>                     (16G/4-core VMs ESXi v5 on 2.5GHz machines not
>>>>>                     shared with any other VMs).  I'm inserting
>>>>>                     time-series data into a single column-family
>>>>>                     using "wide rows" (timeuuids) and have a
>>>>>                     3-part partition key so my primary key is
>>>>>                     something like ((a, b, day), in-time-uuid), x,
>>>>>                     y, z).
>>>>>                     My java client is feeding rows (about 1k of
>>>>>                     raw data size each) in batches using multiple
>>>>>                     threads, and the fastest I can get it to run
>>>>>                     reliably is about 2000 rows/second.  Even at
>>>>>                     that speed, all 3 cassandra nodes are very CPU
>>>>>                     bound, with loads of 6-9 each (and the client
>>>>>                     machine is hardly breaking a sweat).  I've
>>>>>                     tried turning off compression in my table
>>>>>                     which reduced the loads slightly but not much.
>>>>>                      There are no other updates or reads
>>>>>                     occurring, except the datastax opscenter.
>>>>>                     I was expecting to be able to insert at least
>>>>>                     10k rows/second with this configuration, and
>>>>>                     after a lot of reading of docs, blogs, and
>>>>>                     google, can't really figure out what's slowing
>>>>>                     my client down.  When I increase the insert
>>>>>                     speed of my client beyond 2000/second, the
>>>>>                     server responses are just too slow and the
>>>>>                     client falls behind.  I had a single-node
>>>>>                     Mysql database that can handle 10k of these
>>>>>                     data rows/second, so I really feel like I'm
>>>>>                     missing something in Cassandra.  Any ideas?
>>>>             -- 
>>>>             - John
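
The per-iteration client loop described earlier in the thread (divide the inserts across threads, store the time, let all threads send, then check the elapsed time) could be sketched roughly as below. This is only an illustration, since the actual client code was not posted: the class `InsertTimer` and the `sender` callback are hypothetical, with `sender` standing in for whatever call submits one thread's batch to the cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical timing harness for one iteration of a multi-threaded insert client.
final class InsertTimer {
    private InsertTimer() {}

    // Splits rows into `parts` contiguous slices whose sizes differ by at most one.
    static <T> List<List<T>> partition(List<T> rows, int parts) {
        if (parts < 1) throw new IllegalArgumentException("parts must be >= 1");
        List<List<T>> out = new ArrayList<>();
        int base = rows.size() / parts;
        int extra = rows.size() % parts;
        int from = 0;
        for (int i = 0; i < parts; i++) {
            int to = from + base + (i < extra ? 1 : 0);
            out.add(rows.subList(from, to));
            from = to;
        }
        return out;
    }

    // Runs one iteration and returns the elapsed wall-clock time in nanoseconds.
    static <T> long timeIteration(List<T> rows, int threads, Consumer<List<T>> sender) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (List<T> slice : partition(rows, threads)) {
                tasks.add(() -> { sender.accept(slice); return null; });
            }
            long start = System.nanoTime();          // store the current time
            for (Future<Void> f : pool.invokeAll(tasks)) {
                f.get();                             // wait for every thread, surface failures
            }
            return System.nanoTime() - start;        // elapsed time for the iteration
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("insert iteration failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Note that this sketch includes worker-thread startup in the measured time; a client that wants to time only the inserts would keep a warmed-up pool across iterations.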
