X-Original-To: apmail-incubator-kafka-dev-archive@minotaur.apache.org
Delivered-To: apmail-incubator-kafka-dev-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 1F5CBD6D7 for ; Thu, 1 Nov 2012 16:57:15 +0000 (UTC)
Received: (qmail 14257 invoked by uid 500); 1 Nov 2012 16:57:14 -0000
Delivered-To: apmail-incubator-kafka-dev-archive@incubator.apache.org
Received: (qmail 13599 invoked by uid 500); 1 Nov 2012 16:57:14 -0000
Mailing-List: contact kafka-dev-help@incubator.apache.org; run by ezmlm
Precedence: bulk
Reply-To: kafka-dev@incubator.apache.org
Delivered-To: mailing list kafka-dev@incubator.apache.org
Received: (qmail 12550 invoked by uid 99); 1 Nov 2012 16:57:13 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 01 Nov 2012 16:57:13 +0000
Date: Thu, 1 Nov 2012 16:57:13 +0000 (UTC)
From: "Jay Kreps (JIRA)"
To: kafka-dev@incubator.apache.org
Message-ID: <121135671.56771.1351789033156.JavaMail.jiratomcat@arcas>
In-Reply-To: <1279397415.1476.1349379467361.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (KAFKA-545) Add a Performance Suite for the Log subsystem
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


    [ https://issues.apache.org/jira/browse/KAFKA-545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488832#comment-13488832 ]

Jay Kreps commented on KAFKA-545:
---------------------------------

Same test now with 20 seconds worth of data accumulating:

[jkreps@jkreps-ld kafka-jbod]$ java -server -Xmx128M -Xms128M -XX:+UseConcMarkSweepGC -cp project/boot/scala-2.8.0/lib/scala-library.jar:core/target/scala_2.8.0/test-classes kafka.TestFileChannelReadLocking 500000
flushing
flush completed in 0.497006
2.271428 11.766812 1.660411 1.861596 2.039938 1.278876 1.407181 1.130133
1.192209 1.663374 1.658432 1.124757 1.254995 1.848904 1.861381 1.158326
1.414888 1.240507 1.542315 1.543492 1.395788 1.128224 1.244737 1.323254
1.004004 1.508619 1.294839 1.237147 1.369261 1.500938 1.098796 1.140933
1.195621 0.825858 1.21719
flushing
1.187579 1.234125 0.981985 0.999659 1.05744 1.171083
flush completed in 2488.938675
1.219635 1.240126 1.192422 1.604653 1.412199 1.89463 1.282256 1.08756
1.360199 0.947128 1.130891 0.782065 1.453711 1.225088 1.704001 1.110982
1.155404 1.297822 1.450305 1.224275 1.272652 1.280408 1.23271 1.144039
1.273127 1.302072 1.408974 1.348525 1.556987 1.193373 1.407276 1.722947
1.443469 1.751133
flushing
1.288651
flush completed in 608.099163
1.520736 1.233443 1.553179 1.627624 1.613462 1.534873 1.508163 1.538743
1.489821 1.318509 1.537813 1.385722 1.06104 1.31107 1.232484 1.621071
1.63272 1.800139 1.311899 1.315283 1.552909 1.518307 1.384089 1.520744
1.762693 1.467796 1.699609 1.159155 1.469895 1.187978 1.830385
flushing
1.669841 1.341722 1.52613
flush completed in 2040.207681
1.202133 1.400995 1.077904 1.69022 1.055655 1.145438 1.535375 1.281362
1.168067 0.989543 1.162816 1.531742 1.296389 1.065467

Again, the times are more or less constant even though we now have 2-second flushes.

> Add a Performance Suite for the Log subsystem
> ---------------------------------------------
>
>                 Key: KAFKA-545
>                 URL: https://issues.apache.org/jira/browse/KAFKA-545
>             Project: Kafka
>          Issue Type: New Feature
>    Affects Versions: 0.8
>            Reporter: Jay Kreps
>            Priority: Blocker
>              Labels: features
>         Attachments: KAFKA-545-draft.patch
>
>
> We have had several performance concerns and potential improvements for the logging subsystem. To evaluate these in a data-driven way, it would be good to have a single-machine performance test that isolates the performance of the log.
> The performance optimizations we would like to evaluate include
> - Special-casing appends in a follower that already have the correct offset, to avoid decompression and recompression
> - Memory mapping either all or some of the segment files to improve the performance of small appends and lookups
> - Supporting multiple data directories and avoiding RAID
> A standalone tool isolates the component and makes profiling more intelligible.
> This test would drive load against Log/LogManager, controlled by a set of command-line options. This command-line program could then be scripted into a suite of tests that cover variations in message size, message set size, compression, number of partitions, etc.
> Here is a proposed usage for the tool:
> ./bin/kafka-log-perf-test.sh
> Option              Description
> ------              -----------
> --partitions        The number of partitions to write to
> --dir               The directory in which to write the log
> --message-size      The size of the messages
> --set-size          The number of messages per write
> --compression       Compression algorithm
> --messages          The number of messages to write
> --readers           The number of reader threads reading the data
> The tool would capture latency and throughput for the append() and read() operations.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
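
The "more or less constant" observation in the comment above can be checked mechanically by separating the bare timings from the flush durations in the test output. What follows is only a minimal sketch: the class name FlushLogSummary and its methods are hypothetical and not part of Kafka or of kafka.TestFileChannelReadLocking, whose source is not shown in this message. It assumes that bare numbers are per-read timings (a guess based on the test name) and that "flush completed in X" reports a flush duration; the output does not state units, so none are assumed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits the test output into read timings and flush durations. */
final class FlushLogSummary {
    final List<Double> readTimes = new ArrayList<>();   // assumed: one value per timed read
    final List<Double> flushTimes = new ArrayList<>();  // values from "flush completed in X"

    /**
     * Parses whitespace-separated tokens. "flushing" is skipped,
     * "flush completed in X" records X as a flush duration,
     * and every other token is treated as a read timing.
     */
    static FlushLogSummary parse(String output) {
        FlushLogSummary s = new FlushLogSummary();
        String[] tokens = output.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            String t = tokens[i];
            if (t.isEmpty() || t.equals("flushing")) {
                continue;
            }
            if (t.equals("flush") && i + 3 < tokens.length
                    && tokens[i + 1].equals("completed") && tokens[i + 2].equals("in")) {
                s.flushTimes.add(Double.parseDouble(tokens[i + 3]));
                i += 3; // skip "completed", "in", and the duration
            } else {
                s.readTimes.add(Double.parseDouble(t));
            }
        }
        return s;
    }

    /** Largest value in the list, or negative infinity if the list is empty. */
    static double max(List<Double> xs) {
        double m = Double.NEGATIVE_INFINITY;
        for (double x : xs) {
            m = Math.max(m, x);
        }
        return m;
    }

    /** Arithmetic mean, or NaN if the list is empty. */
    static double mean(List<Double> xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return xs.isEmpty() ? Double.NaN : sum / xs.size();
    }

    public static void main(String[] args) {
        // Abbreviated excerpt of the output above; paste the full output to summarize it all.
        String sample = "flushing flush completed in 0.497006 2.271428 11.766812 1.660411 "
                + "flushing 1.187579 flush completed in 2488.938675 1.219635";
        FlushLogSummary s = parse(sample);
        System.out.printf("reads: n=%d mean=%.3f max=%.3f%n",
                s.readTimes.size(), mean(s.readTimes), max(s.readTimes));
        System.out.printf("flushes: n=%d max=%.3f%n",
                s.flushTimes.size(), max(s.flushTimes));
    }
}
```

Comparing the maximum read timing against the maximum flush duration gives a quick indication of whether reads stall while a flush is in progress.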