incubator-cassandra-user mailing list archives

From Nate McCall <n...@thelastpickle.com>
Subject Re: insert performance (1.2.8)
Date Thu, 22 Aug 2013 17:53:16 GMT
Also run iostat -x 5 before and after to get a feel for what's going on with
the storage.
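For anyone following along, a typical invocation and the columns worth watching (flags are standard sysstat options; exact output varies by platform):

```shell
# Extended device stats, refreshed every 5 seconds
iostat -x 5
# Watch %util (device saturation) and await (average request latency, ms);
# a commitlog/data volume pegged near 100% util points at an IO bottleneck.
```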


On Thu, Aug 22, 2013 at 12:52 PM, Nate McCall <nate@thelastpickle.com> wrote:

> Given the backups in the flushing stages, I think you are IO bound. SSDs
> will work best for the data volume. Use rotational media for the commitlog
> as it is largely sequential.
>
> Quick experiment: disable commit log on the keyspace and see if your test
> goes faster ("WITH DURABLE_WRITES = false" on keyspace creation).
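A sketch of that experiment in CQL (keyspace name and replication settings are illustrative, not from the thread):

```sql
-- Test keyspace with the commit log bypassed. Never do this for data you
-- can't afford to lose: a crash drops anything not yet flushed to sstables.
CREATE KEYSPACE insert_test
  WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 2}
  AND DURABLE_WRITES = false;
```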
>
>
> On Wed, Aug 21, 2013 at 5:41 PM, Keith Freeman <8forty@gmail.com> wrote:
>
>>  We have 2 partitions on the same physical disk for commit-log and data.
>> Definitely non-optimal; we're planning to install SSDs for the commit-log
>> partition but don't have them yet.
>>
>> Can this explain the high server loads?
>>
>> On 08/21/2013 04:24 PM, Nate McCall wrote:
>>
>> What's the disk setup like on these systems? You have some pending tasks
>> in MemtablePostFlusher and FlushWriter, which may mean there is contention
>> on flushing discarded segments from the commit log.
>>
>>
>> On Wed, Aug 21, 2013 at 5:14 PM, Keith Freeman <8forty@gmail.com> wrote:
>>
>>>  Ok, I tried batching 500 at a time; it made no noticeable difference in
>>> the server loads.  I have been monitoring JMX via jconsole, if that's what
>>> you mean.  I also did tpstats on all 3 nodes while it was under load (the
>>> 5000 rows/sec test).  The attached file contains a screen shot of the JMX
>>> and the output of the 3 tpstats commands.
>>>
>>>
>>> On 08/21/2013 02:16 PM, Nate McCall wrote:
>>>
>>> The only thing I can think to suggest at this point is upping that batch
>>> size - say to 500 - and seeing what happens.
>>>
>>>  Do you have any monitoring on this cluster? If not, what do you see as
>>> the output of 'nodetool tpstats' while you run this test?
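For reference, the command in question; the interesting columns are Active/Pending/Completed per thread-pool stage:

```shell
# Run on each node; <node> is a placeholder for the host/IP
nodetool -h <node> tpstats
# Growing Pending counts in FlushWriter / MemtablePostFlusher usually
# mean the disks can't keep up with memtable flushes.
```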
>>>
>>>
>>> On Wed, Aug 21, 2013 at 1:40 PM, Keith Freeman <8forty@gmail.com> wrote:
>>>
>>>>  Building the giant batch string wasn't as bad as I thought, and at
>>>> first I had great(!) results (using "unlogged" batches): 2500 rows/sec
>>>> (batches of 100 in 48 threads) ran very smoothly, and the load on the
>>>> cassandra server nodes averaged about 1.0 or less continuously.
>>>>
>>>> But then I upped it to 5000 rows/sec, and the load on the server nodes
>>>> jumped to a continuous 8-10 on all 3, with peaks over 14.  I also
>>>> tried running 2 separate clients at 2500 rows/sec with the same results.  I
>>>> don't see any compactions while at this load, so would this likely be the
>>>> result of GC thrashing?
>>>>
>>>> Seems like I'm spending a lot of effort and am still not getting very
>>>> close to being able to insert 10k rows (~10M of data) per second, which
>>>> is pretty disappointing.
>>>>
>>>>
>>>> On 08/20/2013 07:16 PM, Nate McCall wrote:
>>>>
>>>> Thrift will allow for more large, free-form batch construction. The
>>>> increase will come from doing a lot more in the same payload message.
>>>> Otherwise CQL is more efficient.
>>>>
>>>>  If you do build those giant strings, yes, you should see a performance
>>>> improvement.
>>>>
>>>>
>>>> On Tue, Aug 20, 2013 at 8:03 PM, Keith Freeman <8forty@gmail.com> wrote:
>>>>
>>>>>  Thanks.  Can you tell me why using thrift would improve
>>>>> performance?
>>>>>
>>>>> Also, if I do try to build those giant strings for a prepared batch
>>>>> statement, should I expect another performance improvement?
>>>>>
>>>>>
>>>>>
>>>>> On 08/20/2013 05:06 PM, Nate McCall wrote:
>>>>>
>>>>> Ugh - sorry, I knew Sylvain and Michaël had worked on this recently,
>>>>> but it is only in 2.0 - I could have sworn it got marked for inclusion
>>>>> back into 1.2, but I was wrong:
>>>>> https://issues.apache.org/jira/browse/CASSANDRA-4693
>>>>>
>>>>>  This is indeed an issue if you don't know the column count beforehand
>>>>> (or have a very large number of them, as in your case). Again,
>>>>> apologies, I would not have recommended that route if I knew it was only in
>>>>> 2.0.
>>>>>
>>>>>  I would be willing to bet you could hit those insert numbers pretty
>>>>> easily with thrift given the shape of your mutation.
>>>>>
>>>>>
>>>>> On Tue, Aug 20, 2013 at 5:00 PM, Keith Freeman <8forty@gmail.com> wrote:
>>>>>
>>>>>>  So I tried inserting prepared statements separately (no batch), and
>>>>>> my server nodes' load definitely dropped significantly.  Throughput from my
>>>>>> client improved a bit, but only a few %.  I was able to *almost* get 5000
>>>>>> rows/sec (sort of) by also reducing the rows/insert-thread to 20-50 and
>>>>>> eliminating all overhead from the timing, i.e. timing only the tight for
>>>>>> loop of inserts.  But that's still a lot slower than I expected.
>>>>>>
>>>>>> I couldn't do batches because the driver doesn't allow prepared
>>>>>> statements in a batch (QueryBuilder API).  It appears the batch itself
>>>>>> could possibly be a prepared statement, but since I have 40+ columns on
>>>>>> each insert that would take some ugly code to build, so I haven't tried it
>>>>>> yet.
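For what it's worth, the "ugly code" for a giant unlogged batch string is mostly mechanical string-building. A rough sketch (the keyspace, table, and column names are made up for illustration, and real code should bind or escape values rather than inlining them):

```java
// Hypothetical sketch: assemble one unlogged CQL batch as a single string,
// so many rows go out in one execute() call. Table/columns are invented.
public class BatchSketch {
    static String unloggedBatch(int rows) {
        StringBuilder sb = new StringBuilder("BEGIN UNLOGGED BATCH\n");
        for (int i = 0; i < rows; i++) {
            // One INSERT per row; values inlined here only for brevity
            sb.append("INSERT INTO ks.metrics (a, b, day, ts, x, y, z) ")
              .append("VALUES ('a").append(i)
              .append("', 'b', '2013-08-21', now(), 1, 2, 3);\n");
        }
        return sb.append("APPLY BATCH;").toString();
    }

    public static void main(String[] args) {
        System.out.println(unloggedBatch(3));
    }
}
```

The resulting string is then passed to a single Session.execute() call instead of one call per row.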
>>>>>>
>>>>>> I'm using CL "ONE" on the inserts and RF 2 in my schema.
>>>>>>
>>>>>>
>>>>>> On 08/20/2013 08:04 AM, Nate McCall wrote:
>>>>>>
>>>>>> John makes a good point re: prepared statements (I'd increase batch
>>>>>> sizes again once you do this as well - separate, incremental runs of
>>>>>> course, so you can gauge the effect of each). That should take out some of
>>>>>> the processing overhead of statement validation in the server (some -
>>>>>> that load spike still seems high though).
>>>>>>
>>>>>>  I'd actually be really interested as to what your results are
>>>>>> after doing so - I've not tried any A/B testing here for prepared
>>>>>> statements on inserts.
>>>>>>
>>>>>>  Given your load is on the server, I'm not sure adding more async
>>>>>> indirection on the client would buy you too much though.
>>>>>>
>>>>>>  Also, at what RF and consistency level are you writing?
>>>>>>
>>>>>>
>>>>>> On Tue, Aug 20, 2013 at 8:56 AM, Keith Freeman <8forty@gmail.com> wrote:
>>>>>>
>>>>>>>  Ok, I'll try prepared statements.  But while sending my statements
>>>>>>> async might speed up my client, it wouldn't improve throughput on the
>>>>>>> cassandra nodes, would it?  They're running at pretty high loads and only
>>>>>>> about 10% idle, so my concern is that they can't handle the data any
>>>>>>> faster - something's wrong on the server side.  I don't really think
>>>>>>> there's anything on the client side that matters for this problem.
>>>>>>>
>>>>>>> Of course I know there are obvious h/w things I can do to improve
>>>>>>> server performance: SSDs, more RAM, more cores, etc.  But I thought the
>>>>>>> servers I have would be able to handle more rows/sec than, say, Mysql,
>>>>>>> since write speed is supposed to be one of Cassandra's strengths.
>>>>>>>
>>>>>>>
>>>>>>> On 08/19/2013 09:03 PM, John Sanda wrote:
>>>>>>>
>>>>>>> I'd suggest using prepared statements that you initialize at
>>>>>>> application startup, and switching to Session.executeAsync coupled with
>>>>>>> the Google Guava Futures API to get better throughput on the client side.
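A rough shape of that client-side change, assuming the 1.x datastax java driver and Guava on the classpath (insertStmt, rows, and the accessor names are illustrative placeholders, and this obviously needs a live cluster):

```java
// Sketch only: fire N inserts without blocking per statement, then wait once.
// The PreparedStatement (insertStmt) is built once at application startup;
// executeAsync returns a ResultSetFuture, which is a Guava ListenableFuture.
List<ResultSetFuture> futures = new ArrayList<>();
for (DataRow r : rows) {
    futures.add(session.executeAsync(insertStmt.bind(r.a(), r.b(), r.ts())));
}
// Block once for the whole group instead of once per statement
Futures.allAsList(futures).get();
```

In practice you would also cap the number of in-flight futures so the client doesn't overrun the cluster.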
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Aug 19, 2013 at 10:14 PM, Keith Freeman <8forty@gmail.com> wrote:
>>>>>>>
>>>>>>>>  Sure, I've tried different numbers for batches and threads, but
>>>>>>>> generally I'm running 10-30 threads at a time on the client, each sending a
>>>>>>>> batch of 100 insert statements in every call, using the
>>>>>>>> QueryBuilder.batch() API from the latest datastax java driver, then calling
>>>>>>>> the Session.execute() function (synchronous) on the Batch.
>>>>>>>>
>>>>>>>> I can't post my code, but my client does this on each iteration:
>>>>>>>> -- divides up the set of inserts by the number of threads
>>>>>>>> -- stores the current time
>>>>>>>> -- tells all the threads to send their inserts
>>>>>>>> -- then when they've all returned checks the elapsed time
>>>>>>>>
>>>>>>>> At about 2000 rows for each iteration, 20 threads with 100 inserts
>>>>>>>> each finish in about 1 second.  For 4000 rows, 40 threads with 100 inserts
>>>>>>>> each finish in about 1.5 - 2 seconds, and as I said all 3 cassandra nodes
>>>>>>>> have a heavy CPU load while the client is hardly loaded.  I've tried with
>>>>>>>> 10 threads and more inserts per batch, or up to 60 threads with fewer; it
>>>>>>>> doesn't seem to make a lot of difference.
>>>>>>>>
>>>>>>>>
>>>>>>>> On 08/19/2013 05:00 PM, Nate McCall wrote:
>>>>>>>>
>>>>>>>>  How big are the batch sizes? In other words, how many rows are
>>>>>>>> you sending per insert operation?
>>>>>>>>
>>>>>>>>  Other than the above, not much else to suggest without seeing
>>>>>>>> some example code (on pastebin, gist or similar, ideally).
>>>>>>>>
>>>>>>>> On Mon, Aug 19, 2013 at 5:49 PM, Keith Freeman <8forty@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I've got a 3-node cassandra cluster (16G/4-core VMs ESXi v5 on
>>>>>>>>> 2.5Ghz machines not shared with any other VMs).  I'm inserting time-series
>>>>>>>>> data into a single column-family using "wide rows" (timeuuids) and have a
>>>>>>>>> 3-part partition key, so my primary key is something like
>>>>>>>>> ((a, b, day), in-time-uuid), x, y, z.
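Reconstructing that schema as CQL, under the assumption that x, y, z are regular columns and the names/types are placeholders from the description above:

```sql
-- One partition per (a, b, day); the timeuuid clustering column
-- gives the "wide row" of time-series points within each partition.
CREATE TABLE events (
    a text, b text, day text,
    in_time_uuid timeuuid,
    x text, y text, z text,
    PRIMARY KEY ((a, b, day), in_time_uuid)
);
```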
>>>>>>>>>
>>>>>>>>> My java client is feeding rows (about 1k of raw data size each) in
>>>>>>>>> batches using multiple threads, and the fastest I can get it to run reliably
>>>>>>>>> is about 2000 rows/second.  Even at that speed, all 3 cassandra nodes are
>>>>>>>>> very CPU bound, with loads of 6-9 each (and the client machine is hardly
>>>>>>>>> breaking a sweat).  I've tried turning off compression in my table, which
>>>>>>>>> reduced the loads slightly but not much.  There are no other updates or
>>>>>>>>> reads occurring, except the datastax opscenter.
>>>>>>>>>
>>>>>>>>> I was expecting to be able to insert at least 10k rows/second with
>>>>>>>>> this configuration, and after a lot of reading of docs, blogs, and google,
>>>>>>>>> I can't really figure out what's slowing my client down.  When I increase the
>>>>>>>>> insert speed of my client beyond 2000/second, the server responses are just
>>>>>>>>> too slow and the client falls behind.  I had a single-node Mysql database
>>>>>>>>> that can handle 10k of these data rows/second, so I really feel like I'm
>>>>>>>>> missing something in Cassandra.  Any ideas?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>  --
>>>>>>>
>>>>>>> - John
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
