kafka-dev mailing list archives

From Jun Rao <...@confluent.io>
Subject Re: [VOTE] 0.10.0.0 RC4
Date Thu, 12 May 2016 23:24:28 GMT
Tom,

We don't have a CSV metrics reporter in the producer right now. The metrics
are available through JMX. You can find the details at
http://kafka.apache.org/documentation.html#new_producer_monitoring

Thanks,

Jun

On Thu, May 12, 2016 at 3:08 PM, Tom Crayford <tcrayford@heroku.com> wrote:

> Yep, I can try those particular commits tomorrow. Before I try a bisect,
> I'm going to replicate this with a smaller-scale perf test that is less
> intensive to iterate on.
>
> Jun, inline:
>
> On Thursday, 12 May 2016, Jun Rao <jun@confluent.io> wrote:
>
> > Tom,
> >
> > Thanks for reporting this. A few quick comments.
> >
> > 1. Did you send the right command for producer-perf? The command limits
> > the throughput to 100 msgs/sec. So, not sure how a single producer can
> > get 75K msgs/sec.
>
>
> Ah yep, wrong commands. I'll get the right one tomorrow. Sorry, was
> interpolating variables into a shell script.
>
>
> >
> > 2. Could you collect some stats (e.g. average batch size) in the producer
> > and see if there is any noticeable difference between 0.9 and 0.10?
>
>
> That'd just be hooking up the CSV metrics reporter right?
>
>
> >
> > 3. Is the broker-to-broker communication also on SSL? Could you do
> > another test with replication factor 1 and see if you still see the
> > degradation?
>
>
> Interbroker replication is always SSL in all test runs so far. I can try
> with replication factor 1 tomorrow.
>
>
> >
> > Finally, email is probably not the best way to discuss performance
> > results. If you have more of them, could you create a jira and attach
> > your findings there?
>
>
> Yep. I only wrote the email because JIRA was in lockdown mode and I
> couldn't create new issues.
>
> >
> > Thanks,
> >
> > Jun
> >
> >
> >
> > On Thu, May 12, 2016 at 1:26 PM, Tom Crayford <tcrayford@heroku.com>
> > wrote:
> >
> > > We've started running our usual suite of performance tests against Kafka
> > > 0.10.0.0 RC. These tests orchestrate multiple consumer/producer machines
> > > to run a fairly normal mixed workload of producers and consumers (each
> > > producer/consumer is just an instance of Kafka's inbuilt
> > > consumer/producer perf tests). We've found about a 33% performance drop
> > > in the producer if TLS is used (compared to 0.9.0.1).
> > >
> > > We've seen notable producer performance degradations between 0.9.0.1 and
> > > 0.10.0.0 RC. We're running as of the commit 9404680 right now.
> > >
> > > Our specific test case runs Kafka on 8 EC2 machines, with enhanced
> > > networking. Nothing is changed between the instances, and I've
> > > reproduced this over 4 different sets of clusters now. We're seeing about
> > > a 33% performance drop between 0.9.0.1 and 0.10.0.0 as of commit 9404680.
> > > Please note that this doesn't match up with
> > > https://issues.apache.org/jira/browse/KAFKA-3565, because our performance
> > > tests are with compression off, and this seems to be a TLS-only issue.
> > >
> > > Under 0.10.0-rc4, we see an 8 node cluster with replication factor of 3,
> > > and 13 producers max out at around 1 million 100 byte messages a second.
> > > Under 0.9.0.1, the same cluster does 1.5 million messages a second. Both
> > > tests were with TLS on. I've reproduced this on multiple clusters now (5
> > > or so of each version) to account for the inherent performance variance
> > > of EC2. There's no notable performance difference without TLS on these
> > > runs - it appears to be a TLS regression entirely.
> > >
> > > A single producer with TLS under 0.10 does about 75k messages/s. Under
> > > 0.9.0.1 it does around 120k messages/s.
> > >
> > > The exact producer-perf line we're using is this:
> > >
> > > bin/kafka-producer-perf-test --topic "bench" --num-records "500000000"
> > > --record-size "100" --throughput "100" --producer-props acks="-1"
> > > bootstrap.servers=REDACTED ssl.keystore.location=client.jks
> > > ssl.keystore.password=REDACTED ssl.truststore.location=server.jks
> > > ssl.truststore.password=REDACTED
> > > ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1 security.protocol=SSL
> > >
> > > We're using the same setup, machine type etc for each test run.
> > >
> > > We've tried using both 0.9.0.1 producers and 0.10.0.0 producers and the
> > > TLS performance impact was there for both.
> > >
> > > I've glanced over the code between 0.9.0.1 and 0.10.0.0 and haven't seen
> > > anything that seemed to have this kind of impact - indeed the TLS code
> > > doesn't seem to have changed much between 0.9.0.1 and 0.10.0.0.
> > >
> > > Any thoughts? Should I file an issue and see about reproducing a more
> > > minimal test case?
> > >
> > > I don't think this is related to
> > > https://issues.apache.org/jira/browse/KAFKA-3565 - that is for
> > > compression on and plaintext, and this is for TLS only.
> > >
> >
>
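
As a sanity check on the figures quoted in this thread, here is a minimal Python sketch of the arithmetic. It is not part of the original messages, and the helper name relative_drop is made up for illustration. It works out the relative drop for the cluster-wide and single-producer numbers, and how long the posted command would run if --throughput "100" really capped the producer at 100 msgs/sec, which is why that command cannot be the one that produced 75k messages/s.

```python
def relative_drop(before, after):
    """Fractional drop from `before` to `after` (0.25 means 25% slower)."""
    return (before - after) / before


# Figures quoted in the thread (messages per second, TLS on).
cluster_drop = relative_drop(1_500_000, 1_000_000)  # 8 nodes, 13 producers
single_drop = relative_drop(120_000, 75_000)        # single producer

# The posted command asks for 500,000,000 records at --throughput 100.
num_records = 500_000_000
throttle_msgs_per_sec = 100
capped_runtime_days = num_records / throttle_msgs_per_sec / 86_400

print(f"cluster drop:         {cluster_drop:.1%}")   # 33.3%
print(f"single-producer drop: {single_drop:.1%}")    # 37.5%
print(f"run time at the cap:  {capped_runtime_days:.1f} days")  # 57.9 days
```

The cluster-wide figure matches the "about 33%" in the report. The single-producer figures work out to a somewhat larger drop of 37.5%.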
