incubator-cassandra-user mailing list archives

From Nate McCall <n...@thelastpickle.com>
Subject Re: insert performance (1.2.8)
Date Thu, 22 Aug 2013 17:52:43 GMT
Given the backups in the flushing stages, I think you are IO bound. SSDs
will work best for the data volume. Use rotational media for the commitlog
as it is largely sequential.

Quick experiment: disable commit log on the keyspace and see if your test
goes faster ("WITH DURABLE_WRITES = false" on keyspace creation).
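
A minimal CQL sketch of that experiment, assuming a hypothetical keyspace
named test_ks and SimpleStrategy replication with the RF of 2 that Keith
reports below (adjust both to match the real schema):

```
-- Hypothetical keyspace name; the replication settings are assumptions.
CREATE KEYSPACE test_ks
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2}
  AND durable_writes = false;

-- Or, to switch the setting on an existing keyspace:
ALTER KEYSPACE test_ks WITH durable_writes = false;
```

With durable writes off, anything not yet flushed from the memtables is lost
if a node crashes, so this only makes sense as a diagnostic on test data.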


On Wed, Aug 21, 2013 at 5:41 PM, Keith Freeman <8forty@gmail.com> wrote:

>  We have 2 partitions on the same physical disk for commit-log and data.
> Definitely non-optimal, we're planning to install SSDs for the commit-log
> partition but don't have them yet.
>
> Can this explain the high server loads?
>
> On 08/21/2013 04:24 PM, Nate McCall wrote:
>
> What's the disk setup like on these systems? You have some pending tasks in
> MemtablePostFlusher and FlushWriter which may mean there is contention on
> flushing discarded segments from the commit log.
>
>
> On Wed, Aug 21, 2013 at 5:14 PM, Keith Freeman <8forty@gmail.com> wrote:
>
>>  Ok, I tried batching 500 at a time, made no noticeable difference in the
>> server loads.  I have been monitoring JMX via jconsole if that's what you
>> mean?  I also did tpstats on all 3 nodes while it was under load (the 5000
>> rows/sec test).  Attached file contains a screen shot of the JMX and the
>> output of the 3 tpstats commands.
>>
>>
>> On 08/21/2013 02:16 PM, Nate McCall wrote:
>>
>> The only thing I can think to suggest at this point is upping that batch
>> size - say to 500 and see what happens.
>>
>>  Do you have any monitoring on this cluster? If not, what do you see as
>> the output of 'nodetool tpstats' while you run this test?
>>
>>
>> On Wed, Aug 21, 2013 at 1:40 PM, Keith Freeman <8forty@gmail.com> wrote:
>>
>>>  Building the giant batch string wasn't as bad as I thought, and at
>>> first I had great(!) results (using "unlogged" batches): 2500 rows/sec
>>> (batches of 100 in 48 threads) ran very smoothly, and the load on the
>>> cassandra server nodes averaged about 1.0 or less continuously.
>>>
>>> But then I upped it to 5000 rows/sec, and the load on all 3 server nodes
>>> jumped to a continuous 8-10 with peaks over 14.  I also
>>> tried running 2 separate clients at 2500 rows/sec with the same results.  I
>>> don't see any compactions while at this load, so would this likely be the
>>> result of GC thrashing?
>>>
>>> Seems like I'm spending a lot of effort and am still not getting very
>>> close to being able to insert 10k rows (10M of data each) per second, which
>>> is pretty disappointing.
>>>
>>>
>>> On 08/20/2013 07:16 PM, Nate McCall wrote:
>>>
>>> Thrift will allow for larger, more free-form batch construction. The
>>> gain comes from doing a lot more in the same payload message. Otherwise
>>> CQL is more efficient.
>>>
>>>  If you do build those giant strings, yes you should see a performance
>>> improvement.
>>>
>>>
>>> On Tue, Aug 20, 2013 at 8:03 PM, Keith Freeman <8forty@gmail.com> wrote:
>>>
>>>>  Thanks.  Can you tell me why using thrift would improve
>>>> performance?
>>>>
>>>> Also, if I do try to build those giant strings for a prepared batch
>>>> statement, should I expect another performance improvement?
>>>>
>>>>
>>>>
>>>> On 08/20/2013 05:06 PM, Nate McCall wrote:
>>>>
>>>> Ugh - sorry, I knew Sylvain and Michaël had worked on this recently
>>>> but it is only in 2.0 - I could have sworn it got marked for inclusion
>>>> back into 1.2 but I was wrong:
>>>> https://issues.apache.org/jira/browse/CASSANDRA-4693
>>>>
>>>>  This is indeed an issue if you don't know the column count
>>>> beforehand (or have a very large number of them like in your case). Again,
>>>> apologies, I would not have recommended that route if I knew it was only
>>>> in 2.0.
>>>>
>>>>  I would be willing to bet you could hit those insert numbers pretty
>>>> easily with thrift given the shape of your mutation.
>>>>
>>>>
>>>> On Tue, Aug 20, 2013 at 5:00 PM, Keith Freeman <8forty@gmail.com> wrote:
>>>>
>>>>>  So I tried inserting prepared statements separately (no batch), and
>>>>> my server nodes' load definitely dropped significantly.  Throughput from my
>>>>> client improved a bit, but only a few %.  I was able to *almost* get 5000
>>>>> rows/sec (sort of) by also reducing the rows/insert-thread to 20-50 and
>>>>> eliminating all overhead from the timing, i.e. timing only the tight for
>>>>> loop of inserts.  But that's still a lot slower than I expected.
>>>>>
>>>>> I couldn't do batches because the driver doesn't allow prepared
>>>>> statements in a batch (QueryBuilder API).  It appears the batch itself
>>>>> could possibly be a prepared statement, but since I have 40+ columns on
>>>>> each insert that would take some ugly code to build so I haven't tried it
>>>>> yet.
>>>>>
>>>>> I'm using CL "ONE" on the inserts and RF 2 in my schema.
>>>>>
>>>>>
>>>>> On 08/20/2013 08:04 AM, Nate McCall wrote:
>>>>>
>>>>> John makes a good point re: prepared statements (I'd increase batch
>>>>> sizes again once you did this as well - separate, incremental runs of
>>>>> course so you can gauge the effect of each). That should take out some of
>>>>> the processing overhead of statement validation in the server (some - that
>>>>> load spike still seems high though).
>>>>>
>>>>>  I'd actually be really interested as to what your results were after
>>>>> doing so - I've not tried any A/B testing here for prepared statements on
>>>>> inserts.
>>>>>
>>>>>  Given your load is on the server, I'm not sure adding more async
>>>>> indirection on the client would buy you too much though.
>>>>>
>>>>>  Also, at what RF and consistency level are you writing?
>>>>>
>>>>>
>>>>> On Tue, Aug 20, 2013 at 8:56 AM, Keith Freeman <8forty@gmail.com> wrote:
>>>>>
>>>>>>  Ok, I'll try prepared statements.   But while sending my statements
>>>>>> async might speed up my client, it wouldn't improve throughput on the
>>>>>> cassandra nodes would it?  They're running at pretty high loads and only
>>>>>> about 10% idle, so my concern is that they can't handle the data any
>>>>>> faster, so something's wrong on the server side.  I don't really think
>>>>>> there's anything on the client side that matters for this problem.
>>>>>>
>>>>>> Of course I know there are obvious h/w things I can do to improve
>>>>>> server performance: SSDs, more RAM, more cores, etc.  But I thought the
>>>>>> servers I have would be able to handle more rows/sec than say Mysql, since
>>>>>> write speed is supposed to be one of Cassandra's strengths.
>>>>>>
>>>>>>
>>>>>> On 08/19/2013 09:03 PM, John Sanda wrote:
>>>>>>
>>>>>> I'd suggest using prepared statements that you initialize at
>>>>>> application start up and switching to use Session.executeAsync coupled with
>>>>>> Google Guava Futures API to get better throughput on the client side.
>>>>>>
>>>>>>
>>>>>> On Mon, Aug 19, 2013 at 10:14 PM, Keith Freeman <8forty@gmail.com> wrote:
>>>>>>
>>>>>>>  Sure, I've tried different numbers for batches and threads, but
>>>>>>> generally I'm running 10-30 threads at a time on the client, each sending a
>>>>>>> batch of 100 insert statements in every call, using the
>>>>>>> QueryBuilder.batch() API from the latest datastax java driver, then calling
>>>>>>> the Session.execute() function (synchronous) on the Batch.
>>>>>>>
>>>>>>> I can't post my code, but my client does this on each iteration:
>>>>>>> -- divides up the set of inserts by the number of threads
>>>>>>> -- stores the current time
>>>>>>> -- tells all the threads to send their inserts
>>>>>>> -- then when they've all returned checks the elapsed time
>>>>>>>
>>>>>>> At about 2000 rows for each iteration, 20 threads with 100 inserts
>>>>>>> each finish in about 1 second.  For 4000 rows, 40 threads with 100 inserts
>>>>>>> each finish in about 1.5 - 2 seconds, and as I said all 3 cassandra nodes
>>>>>>> have a heavy CPU load while the client is hardly loaded.  I've tried with
>>>>>>> 10 threads and more inserts per batch, or up to 60 threads with fewer,
>>>>>>> doesn't seem to make a lot of difference.
>>>>>>>
>>>>>>>
>>>>>>> On 08/19/2013 05:00 PM, Nate McCall wrote:
>>>>>>>
>>>>>>>  How big are the batch sizes? In other words, how many rows are you
>>>>>>> sending per insert operation?
>>>>>>>
>>>>>>>  Other than the above, not much else to suggest without seeing some
>>>>>>> example code (on pastebin, gist or similar, ideally).
>>>>>>>
>>>>>>> On Mon, Aug 19, 2013 at 5:49 PM, Keith Freeman <8forty@gmail.com> wrote:
>>>>>>>
>>>>>>>> I've got a 3-node cassandra cluster (16G/4-core VMs ESXi v5 on
>>>>>>>> 2.5Ghz machines not shared with any other VMs).  I'm inserting time-series
>>>>>>>> data into a single column-family using "wide rows" (timeuuids) and have a
>>>>>>>> 3-part partition key so my primary key is something like ((a, b, day),
>>>>>>>> in-time-uuid), x, y, z).
>>>>>>>>
>>>>>>>> My java client is feeding rows (about 1k of raw data size each) in
>>>>>>>> batches using multiple threads, and the fastest I can get it to run reliably
>>>>>>>> is about 2000 rows/second.  Even at that speed, all 3 cassandra nodes are
>>>>>>>> very CPU bound, with loads of 6-9 each (and the client machine is hardly
>>>>>>>> breaking a sweat).  I've tried turning off compression in my table which
>>>>>>>> reduced the loads slightly but not much.  There are no other updates or
>>>>>>>> reads occurring, except the datastax opscenter.
>>>>>>>>
>>>>>>>> I was expecting to be able to insert at least 10k rows/second with
>>>>>>>> this configuration, and after a lot of reading of docs, blogs, and google,
>>>>>>>> can't really figure out what's slowing my client down.  When I increase the
>>>>>>>> insert speed of my client beyond 2000/second, the server responses are just
>>>>>>>> too slow and the client falls behind.  I had a single-node Mysql database
>>>>>>>> that can handle 10k of these data rows/second, so I really feel like I'm
>>>>>>>> missing something in Cassandra.  Any ideas?
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>  --
>>>>>>
>>>>>> - John
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
