kafka-dev mailing list archives

From "Jun Guo -X (jungu - CIIC at Cisco)" <ju...@cisco.com>
Subject About Kafka 0.8 producer throughput test
Date Thu, 17 Jan 2013 04:33:25 GMT
Hi,
      I have run the producer throughput test (Kafka 0.8) many times, but the average throughput is only about 3 MB/s.
Below is my test environment:
       CPU core      :16
       Vendor_id     :GenuineIntel
       Cpu family     :6
       Cpu MHz      :2899.999
       Cache size    :20480 KB
       Cpu level      :13
       MEM             :16330832KB=15.57GB
       Disk       : RAID5

       I don’t know the details of the disks, such as the rotation speed, but I ran some
tests of the disk I/O performance. The write rate is 500-600 MB/s and the read rate
is 180 MB/s. The details are below.
[image: image002.png]

I adjusted the broker configuration file as the official documentation recommends, and I set
the JVM memory to 5120 MB.
I ran the producer performance test with the script kafka-producer-perf-test.sh, using this
command:
bin/kafka-producer-perf-test.sh --broker-list 10.75.167.46:49092 \
  --topics topic_perf_46_1,topic_perf_46_2,topic_perf_46_3,topic_perf_46_4,topic_perf_46_5,topic_perf_46_6,topic_perf_46_7,topic_perf_46_8,topic_perf_46_9,topic_perf_46_10 \
  --initial-message-id 0 --threads 200 --messages 1000000 --message-size 200 \
  --compression-codec 1

But the result is far below the figure in the official documentation (50 MB/s) and the figure
in your paper (100 MB/s). The test result is as follows:
2013-01-17 04:15:24:768, 2013-01-17 04:25:01:637, 0, 200, 200, 1907.35, 3.3064, 10000000, 17334.9582
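
The two rates in the result line can be cross-checked from the timestamps. The sketch below assumes the columns are start time, end time, compression codec, message size in bytes, batch size, total MB sent, MB/s, total messages, and messages/s. That column order is an assumption, since the message does not include the tool's header line. It also assumes "MB" means 1024 * 1024 bytes.

```python
from datetime import datetime

# Result line copied from the message above, joined onto one line.
RESULT = ("2013-01-17 04:15:24:768, 2013-01-17 04:25:01:637, 0, 200, 200, "
          "1907.35, 3.3064, 10000000, 17334.9582")

# Assumed column order (not confirmed by the message itself).
FIELDS = ["start_time", "end_time", "compression", "message_size",
          "batch_size", "total_mb", "mb_per_sec", "total_msgs", "msgs_per_sec"]


def parse_result(line):
    """Split the comma-separated result line into a dict of strings."""
    values = [v.strip() for v in line.split(",")]
    return dict(zip(FIELDS, values))


def elapsed_seconds(row):
    """Wall-clock duration of the run; the last time field is milliseconds."""
    fmt = "%Y-%m-%d %H:%M:%S:%f"
    start = datetime.strptime(row["start_time"], fmt)
    end = datetime.strptime(row["end_time"], fmt)
    return (end - start).total_seconds()


row = parse_result(RESULT)
secs = elapsed_seconds(row)

# Recompute both rates from message count, message size, and duration.
total_bytes = int(row["total_msgs"]) * int(row["message_size"])
mb_per_sec = total_bytes / (1024 * 1024) / secs
msgs_per_sec = int(row["total_msgs"]) / secs

print(f"elapsed: {secs:.3f} s")
print(f"recomputed: {mb_per_sec:.4f} MB/s, {msgs_per_sec:.4f} msgs/s")
print(f"reported:   {row['mb_per_sec']} MB/s, {row['msgs_per_sec']} msgs/s")
```

Under these assumptions the reported rates are consistent with the timestamps, so the low figure reflects the run itself rather than a reporting error.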

I also ran the consumer throughput test. The result is about 60 MB/s, while the official
documentation reports 100 MB/s.
I really don’t know why.
High throughput is one of the most important features of Kafka, so I am quite concerned
about this.

Thanks and best regards!

From: Jay Kreps [mailto:jkreps@linkedin.com]
Sent: January 16, 2013 2:22
To: Jun Guo -X (jungu - CIIC at Cisco)
Subject: RE: About acknowledge from broker to producer in your paper.

Not sure which version you are using...

In 0.7 this would happen only if there was a socket level error (i.e. can't connect to the
host). This covers a lot of cases since in the event of I/O errors (disk full, etc) we just
have that node shut itself down to let others take over.

In 0.8 we send all errors back to the client.

So the difference is that, for example, in the event of a disk error, in 0.7 the client would
send a message, the broker would get an error and shoot itself in the head and disconnect
its clients, and the client would get the error the next time it tried to send a message.
So in 0.7 the error might not get passed back to the client until the second message send.
In 0.8 this would happen with the first send, which is an improvement.
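
The timing difference described above can be illustrated with a toy model. This is not Kafka code: `ToyBroker`, `reports_errors`, and `first_send_that_sees_error` are hypothetical names, and the model only assumes what the text says, namely that a 0.7-style broker sends no response and shuts itself down on a disk error, while a 0.8-style broker returns the error to the client.

```python
class ToyBroker:
    """Toy broker; reports_errors=True models 0.8, False models 0.7."""

    def __init__(self, reports_errors):
        self.reports_errors = reports_errors
        self.connected = True
        self.disk_ok = True

    def send(self, message):
        # A dead broker shows up as a socket-level error on the client.
        if not self.connected:
            raise ConnectionError("broker closed the connection")
        if not self.disk_ok:
            if self.reports_errors:
                return "error"      # 0.8 style: error goes back to the client
            self.connected = False  # 0.7 style: broker shuts itself down
            return None             # the client has seen nothing wrong yet
        return "ok" if self.reports_errors else None


def first_send_that_sees_error(broker, attempts=3):
    """Return the 1-based index of the first send on which the client
    learns about a disk error, or None if it never does."""
    broker.disk_ok = False
    for n in range(1, attempts + 1):
        try:
            if broker.send(b"msg") == "error":
                return n
        except ConnectionError:
            return n
    return None


seen_07 = first_send_that_sees_error(ToyBroker(reports_errors=False))
seen_08 = first_send_that_sees_error(ToyBroker(reports_errors=True))
print("0.7-style client sees the error on send", seen_07)
print("0.8-style client sees the error on send", seen_08)
```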

-Jay
________________________________
From: Jun Guo -X (jungu - CIIC at Cisco) [jungu@cisco.com]
Sent: Monday, January 14, 2013 9:45 PM
To: Jay Kreps
Subject: About acknowledge from broker to producer in your paper.
Hi,
       I have read your paper, "Kafka: a Distributed Messaging System for Log Processing".
       The experimental results section contains the following passage:

       There are a few reasons why Kafka performed much better. First, the Kafka producer
currently doesn’t wait for acknowledgements from the broker and sends messages as faster
as the broker can handle. This significantly increased the throughput of the publisher. With
a batch size of 50, a single Kafka producer almost saturated the 1Gb link between the producer
and the broker. This is a valid optimization for the log aggregation case, as data must be
sent asynchronously to avoid introducing any latency into the live serving of traffic. We
note that without acknowledging the producer, there is no guarantee that every published message
is actually received by the broker. For many types of log data, it is desirable to trade durability
for throughput, as long as the number of dropped messages is relatively small. However, we
do plan to address the durability issue for more critical data in the future.

       But I have run a series of tests. I found that if I shut down all the brokers and then
send a message from the producer, the producer reports kafka.common.FailedToSendMessageException
with the message "Failed to send messages after 3 tries."
[image: image003.png]
       If there is no acknowledgement from the broker, how does the producer find out that the
send failed? And how does it retry 3 times?
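
One way to picture the behaviour in question is a client-side retry loop that reacts to connection errors, which the client can observe locally without any broker acknowledgement. The sketch below is a toy model under that assumption, not the Kafka producer's implementation. `send_with_retries`, `broker_down`, and `make_flaky` are hypothetical names, and the limit of 3 tries is taken from the error message above.

```python
class FailedToSendMessage(Exception):
    """Toy stand-in for the exception named in the message above."""


def send_with_retries(send, message, max_tries=3):
    """Call send(message) up to max_tries times.

    Returns the attempt number that succeeded, or raises
    FailedToSendMessage once every attempt has hit a connection error.
    """
    for attempt in range(1, max_tries + 1):
        try:
            send(message)
            return attempt
        except ConnectionError:
            continue  # socket-level failure: try again
    raise FailedToSendMessage(
        f"Failed to send messages after {max_tries} tries.")


def broker_down(message):
    """Models all brokers being shut down: connecting always fails."""
    raise ConnectionRefusedError("no broker listening")


def make_flaky(fail_first):
    """Build a sender that fails fail_first times, then succeeds."""
    calls = {"n": 0}

    def send(message):
        calls["n"] += 1
        if calls["n"] <= fail_first:
            raise ConnectionResetError("connection dropped")

    return send


try:
    send_with_retries(broker_down, b"hello")
    outcome = "sent"
except FailedToSendMessage as exc:
    outcome = str(exc)

print(outcome)
```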

       Maybe the acknowledgement in your paper refers to something else. If so, please tell me
what it means.

       Many thanks and best regards!

Guo Jun