Date: Wed, 28 Aug 2013 22:06:32 -0700
Subject: config documentation
From: Jay Kreps
To: "dev@kafka.apache.org"

I took a pass through the configuration docs (http://localhost/documentation.html#configuration) and tried to make them more understandable. We really could do better at this. If we are going to take the time to make a configuration we should bother to explain exactly what it does and tell people how to use it properly.

It would be really great if others could do a pass through and sanity check the docs with an eye towards the user.

Here is an example of bad configuration documentation:

property: fetch.purgatory.purge.interval.requests
default: 10000
description: The purge interval (in number of requests) of the fetch request purgatory

What is a request purgatory? What is a purge interval? What does it mean to purge a request purgatory? How should I set this value?

Or

property: consumer.id
default: null
description: Generated automatically if not set.

This kind of fails to answer the obvious question: what is a consumer id and why might I want to set one?

I think this actually accounts for a fairly high percentage of the support load as well as just scaring people away.

Here is a complete diff of my changes:

jkreps-mn:kafka-site jkreps$ svn diff
Index: 08/configuration.html
===================================================================
--- 08/configuration.html (revision 1518456)
+++ 08/configuration.html (working copy)
@@ -1,3 +1,7 @@
+Kafka uses the property file format for configuration. These can be supplied either from a file or programmatically.
+

+Some configurations have both a default global setting and topic-level overrides. The topic-level properties use a CSV format (e.g., "xyz.per.topic=topic1:value1,topic2:value2") and they override the default value for the specified topics.
+

3.1 Broker Configs

The essential configurations are the following:
    @@ -6,8 +10,6 @@
  • zookeeper.connect
-Note that some configurations have both a default global setting as well as a topic level setting. The topic level properties have the format of csv (e.g., "topic1:value1,topic2:value2") and they override the values in the global setting for those specified topics.
-
@@ -17,14 +19,20 @@
Property
broker.id
-Each broker is uniquely identified by a non-negative integer id. This id serves as the brokers "name", and allows the broker to be moved to a different host/port without confusing consumers.
+Each broker is uniquely identified by a non-negative integer id. This id serves as the broker's "name", and allows the broker to be moved to a different host/port without confusing consumers. You can choose any number you like so long as it is unique.

log.dirs /tmp/kafka-logs
-The directories in which the log data is kept
+A comma-separated list of one or more directories in which Kafka data is stored. Each new partition that is created will be placed in the directory which currently has the fewest partitions.

+port 6667
+The port on which the server accepts client connections.

zookeeper.connect null
Specifies the zookeeper connection string in the form hostname:port, where hostname and port are the host and port for a node in your zookeeper cluster. To allow connecting through other zookeeper nodes when that host is down you can also specify multiple hosts in the form hostname1:port1,hostname2:port2,hostname3:port3.
@@ -34,140 +42,135 @@
message.max.bytes 1000000
-The maximum size of a message that the server can receive
+The maximum size of a message that the server can receive. It is important that this property be in sync with the maximum fetch size your consumers use or else an unruly producer will be able to publish messages too large for consumers to consume.

num.network.threads 3
-The number of network threads that the server uses for handling network requests
+The number of network threads that the server uses for handling network requests. You probably don't need to change this.

num.io.threads 8
-The number of io threads that the server uses for carrying out network requests
+The number of I/O threads that the server uses for executing requests. You should have at least as many threads as you have disks.

queued.max.requests 500
-The number of queued requests allowed before blocking the network threads
+The number of requests that can be queued up for processing by the I/O threads before the network threads stop reading in new requests.

-port 6667
-The port to listen and accept connections on
host.name null
-Hostname of broker. If this is set, it will only bind to this address. If this is not set, it will bind to all interfaces, and publish one to ZK
+Hostname of broker. If this is set, it will only bind to this address. If this is not set, it will bind to all interfaces, and publish one to ZK.

socket.send.buffer.bytes 100 * 1024
-The SO_SNDBUFF buffer of the socket sever sockets
+The SO_SNDBUF buffer the server prefers for socket connections.

socket.receive.buffer.bytes 100 * 1024
-The SO_RCVBUFF buffer of the socket sever sockets
+The SO_RCVBUF buffer the server prefers for socket connections.

socket.request.max.bytes 100 * 1024 * 1024
-The maximum number of bytes in a socket request
+The maximum request size the server will allow. This prevents the server from running out of memory and should be smaller than the Java heap size.

num.partitions 1
-The default number of log partitions per topic
+The default number of partitions per topic.
log.segment.bytes 1024 * 1024 * 1024
-The maximum size of a single log file
+The log for a topic partition is stored as a directory of segment files. This setting controls the size to which a segment file will grow before a new segment is rolled over in the log.

log.segment.bytes.per.topic ""
-The maximum size of a single log file for some specific topic
+This setting allows overriding log.segment.bytes on a per-topic basis.

log.roll.hours 24 * 7
-The maximum time before a new log segment is rolled out
+This setting will force Kafka to roll a new log segment even if the log.segment.bytes size has not been reached.

log.roll.hours.per.topic ""
-The number of hours before rolling out a new log segment for some specific topic
+This setting allows overriding log.roll.hours on a per-topic basis.

log.retention.hours 24 * 7
-The number of hours to keep a log file before deleting it
+The number of hours to keep a log segment before it is deleted, i.e. the default data retention window for all topics. Note that if both log.retention.hours and log.retention.bytes are set, we delete a segment when either limit is exceeded.

log.retention.hours.per.topic ""
-The number of hours to keep a log file before deleting it for some specific topic
+A per-topic override for log.retention.hours.

log.retention.bytes -1
-The maximum size of the log per partition
+The amount of data to retain in the log for each topic-partition. Note that this is the limit per partition, so multiply by the number of partitions to get the total data retained for the topic. Also note that if both log.retention.hours and log.retention.bytes are set, we delete a segment when either limit is exceeded.

log.retention.bytes.per.topic ""
-The maximum size of the log for each partition in some specific topics
+A per-topic override for log.retention.bytes.

log.cleanup.interval.mins 10
-The frequency in minutes that the log cleaner checks whether any log is eligible for deletion
+The frequency in minutes that the log cleaner checks whether any log segment is eligible for deletion to meet the retention policies.
log.index.size.max.bytes 10 * 1024 * 1024
-The maximum size in bytes of the offset index
+The maximum size in bytes we allow for the offset index for each log segment. Note that we will always pre-allocate a sparse file with this much space and shrink it down when the log rolls. If the index fills up we will roll a new log segment even if we haven't reached the log.segment.bytes limit.

log.index.interval.bytes 4096
-The interval with which we add an entry to the offset index
+The byte interval at which we add an entry to the offset index. When executing a fetch request the server must do a linear scan for up to this many bytes to find the correct position in the log to begin and end the fetch. So setting this value to be smaller will mean larger index files (and a bit more memory usage) but less scanning. However the server will never add more than one index entry per log append (even if more than log.index.interval.bytes worth of messages are appended). In general you probably don't need to mess with this value.

log.flush.interval.messages 10000
-The number of messages accumulated on a log partition before messages are flushed to disk
+The number of messages written to a log partition before we force an fsync on the log. Setting this higher will improve performance a lot but will increase the window of data at risk in the event of a crash (though that is usually best addressed through replication). If both this setting and log.flush.interval.ms are used, the log will be flushed when either criterion is met.

log.flush.interval.ms.per.topic ""
-The maximum time in ms that a message in selected topics is kept in memory before flushed to disk, e.g., topic1:3000,topic2:6000
+The per-topic override for log.flush.interval.ms, e.g., topic1:3000,topic2:6000

log.flush.scheduler.interval.ms 3000
-The frequency in ms that the log flusher checks whether any log needs to be flushed to disk
+The frequency in ms that the log flusher checks whether any log is eligible to be flushed to disk.

log.flush.interval.ms 3000
-The maximum time in ms that a message in any topic is kept in memory before flushed to disk
+The maximum time between fsync calls on the log. If used in conjunction with log.flush.interval.messages the log will be flushed when either criterion is met.

auto.create.topics.enable true
-Enable auto creation of topic on the server
+Enable auto creation of topic on the server. If this is set to true then attempts to produce, consume, or fetch metadata for a non-existent topic will automatically create it with the default replication factor and number of partitions.
controller.socket.timeout.ms 30000
-The socket timeout for controller-to-broker channels
+The socket timeout for commands from the partition management controller to the replicas.

controller.message.queue.size

default.replication.factor 1
-Default replication factors for automatically created topics
+The default replication factor for automatically created topics.

replica.lag.time.max.ms 10000
-If a follower hasn't sent any fetch requests during this time, the leader will remove the follower from isr
+If a follower hasn't sent any fetch requests for this window of time, the leader will remove the follower from ISR and treat it as dead.

replica.lag.max.messages 4000
-If the lag in messages between a leader and a follower exceeds this number, the leader will remove the follower from isr
+If a replica falls more than this many messages behind the leader, the leader will remove the follower from ISR and treat it as dead.

replica.socket.timeout.ms 30 * 1000
-The socket timeout for network requests
+The socket timeout for network requests to the leader for replicating data.

replica.socket.receive.buffer.bytes 64 * 1024
-The socket receive buffer for network requests
+The socket receive buffer for network requests to the leader for replicating data.

replica.fetch.max.bytes 1024 * 1024
-The number of byes of messages to attempt to fetch
+The number of bytes of messages to attempt to fetch for each partition in the fetch requests the replicas send to the leader.

replica.fetch.wait.max.ms 500
-Max wait time for each fetcher request issued by follower replicas
+The maximum amount of time to wait for data to arrive on the leader in the fetch requests sent by the replicas to the leader.

replica.fetch.min.bytes 1
-Minimum bytes expected for each fetch response. If not enough bytes, wait up to replicaMaxWaitTimeMs
+Minimum bytes expected for each fetch response for the fetch requests from the replica to the leader. If not enough bytes, wait up to replica.fetch.wait.max.ms for this many bytes to arrive.
num.replica.fetchers 1
-Number of fetcher threads used to replicate messages from a source broker. Increasing this value can increase the degree of I/O parallelism in the follower broker.
+Number of threads used to replicate messages from leaders. Increasing this value can increase the degree of I/O parallelism in the follower broker.

replica.high.watermark.checkpoint.interval.ms 5000
-The frequency with which the high watermark is saved out to disk
+The frequency with which each replica saves its high watermark to disk to handle recovery.

fetch.purgatory.purge.interval.requests 10000
-The purge interval (in number of requests) of the fetch request purgatory
+The purge interval (in number of requests) of the fetch request purgatory.

producer.purgatory.purge.interval.requests 10000
-The purge interval (in number of requests) of the producer request purgatory
+The purge interval (in number of requests) of the producer request purgatory.

zookeeper.session.timeout.ms 6000
-Zookeeper session timeout
+Zookeeper session timeout. If the server fails to heartbeat to zookeeper within this period of time it is considered dead. If you set this too low the server may be falsely considered dead; if you set it too high it may take too long to recognize a truly dead server.

zookeeper.connection.timeout.ms 6000
-The max time that the client waits to establish a connection to zookeeper
+The max time that the client waits to establish a connection to zookeeper.

zookeeper.sync.time.ms

controlled.shutdown.max.retries 3
-Number of retries to complete the controlled shutdown successfully
+Number of retries to complete the controlled shutdown successfully before executing an unclean shutdown.

controlled.shutdown.retry.backoff.ms 5000
-Backoff time between two retries
+Backoff time between shutdown retries.

More details about broker configuration can be found in the scala class kafka.server.KafkaConfig.

+

3.2 Consumer Configs

The essential consumer configurations are the following:
@@ -285,7 +289,7 @@
group.id
-A string that uniquely identifies a set of consumers within the same consumer group
+A string that uniquely identifies the group of consumer processes to which this consumer belongs. By setting the same group id multiple processes indicate that they are all part of the same consumer group.

zookeeper.connect
@@ -314,17 +318,17 @@
fetch.message.max.bytes 1024 * 1024
-The number of byes of messages to attempt to fetch
+The number of bytes of messages to attempt to fetch for each topic-partition in each fetch request. These bytes will be read into memory for each partition, so this helps control the memory used by the consumer. The fetch request size must be at least as large as the maximum message size the server allows or else it is possible for the producer to send messages larger than the consumer can fetch.

auto.commit.enable true
-If true, periodically commit to zookeeper the offset of messages already fetched by the consumer
+If true, periodically commit to zookeeper the offset of messages already fetched by the consumer. This committed offset will be used when the process fails as the position from which the new consumer will begin.

auto.commit.interval.ms 60 * 1000
-The frequency in ms that the consumer offsets are committed to zookeeper
+The frequency in ms that the consumer offsets are committed to zookeeper.

queued.max.message.chunks
@@ -334,12 +338,12 @@
rebalance.max.retries 4
-Max number of retries during rebalance
+When a new consumer joins a consumer group the set of consumers attempt to "rebalance" the load to assign partitions to each consumer. If the set of consumers changes while this assignment is taking place the rebalance will fail and retry. This setting controls the maximum number of attempts before giving up.

fetch.min.bytes 1
-The minimum amount of data the server should return for a fetch request. If insufficient data is available the request will block
+The minimum amount of data the server should return for a fetch request. If insufficient data is available the request will wait for that much data to accumulate before answering the request.

fetch.wait.max.ms
@@ -349,12 +353,12 @@
rebalance.backoff.ms 2000
-Backoff time between retries during rebalance
+Backoff time between retries during rebalance.

refresh.leader.backoff.ms 200
-Backoff time to refresh the leader of a partition after it loses the current leader
+Backoff time to wait before trying to determine the leader of a partition that has just lost its leader.

auto.offset.reset
@@ -370,18 +374,18 @@
client.id
-${group.id}
-Client id is specified by the kafka consumer client, used to distinguish different clients
+group id value
+The client id is a user-specified string sent in each request to help trace calls. It should logically identify the application making the request.

zookeeper.session.timeout.ms 6000
-Zookeeper session timeout
+Zookeeper session timeout. If the consumer fails to heartbeat to zookeeper for this period of time it is considered dead and a rebalance will occur.

zookeeper.connection.timeout.ms 6000
-The max time that the client waits to establish a connection to zookeeper
+The max time that the client waits while establishing a connection to zookeeper.

zookeeper.sync.time.ms
@@ -419,35 +423,46 @@
request.required.acks 0
-This value controls when the producer receives an acknowledgement from the broker. Typical values are (1) 0, which means that the producer never waits for an acknowledgement from the broker (the same behavior as 0.7); (2) 1, which means that the producer gets an acknowledgement after the leader replica has received the data; (3) -1, which means that the producer gets an acknowledgement after all in-sync replicas have received the data. The first option provides the lowest latency (no network delay), but the worst durability (some data loss when the leader replica fails). The second option provides lower latency (one network round trip) and better durability (few data loss when the leader replica fails). The last option provides low latency (two network round trips) and the best durability (no data loss as long as the number of failed brokers is less the replication factor of the topic).
+This value controls when a produce request is considered completed. Specifically, how many other brokers must have committed the data to their log and acknowledged this to the leader? Typical values are
+• 0, which means that the producer never waits for an acknowledgement from the broker (the same behavior as 0.7). This option provides the lowest latency but the weakest durability guarantees (some data will be lost when a server fails).
+• 1, which means that the producer gets an acknowledgement after the leader replica has received the data. This option provides better durability as the client waits until the server acknowledges the request as successful (only messages that were written to the now-dead leader but not yet replicated will be lost).
+• -1, which means that the producer gets an acknowledgement after all in-sync replicas have received the data. This option provides the best durability: we guarantee that no messages will be lost as long as at least one in-sync replica remains.

+request.timeout.ms 1500
+The amount of time the broker will wait trying to meet the request.required.acks requirement before sending back an error to the client.

producer.type sync
-This parameter specifies whether the messages are sent asynchronously or not. Valid values are (1) async for asynchronous send and (2) sync for synchronous send.
+This parameter specifies whether the messages are sent asynchronously in a background thread. Valid values are (1) async for asynchronous send and (2) sync for synchronous send. By setting the producer to async we allow batching together of requests (which is great for throughput) but open the possibility of a failure of the client machine dropping unsent data.

serializer.class
-DefaultEncoder
+kafka.serializer.DefaultEncoder
The serializer class for messages. The default encoder takes a byte[] and returns the same byte[].

key.serializer.class
-${serializer.class}
-The serializer class for keys (defaults to the same as for messages)
+
+The serializer class for keys (defaults to the same as for messages if nothing is given).

partitioner.class
-DefaultPartitioner
+kafka.producer.DefaultPartitioner
The partitioner class for partitioning messages amongst sub-topics. The default partitioner is based on the hash of the key.

compression.codec none
-This parameter allows you to specify the compression codec for all data generated by this producer. Valid values are none, gzip and snappy.
+This parameter allows you to specify the compression codec for all data generated by this producer. Valid values are "none", "gzip" and "snappy".

@@ -461,14 +476,14 @@
message.send.max.retries 3
-The leader may be unavailable transiently, which can fail the sending of a message. This property specifies the number of retries when such failures occur.
+This property will cause the producer to automatically retry a failed send request. This property specifies the number of retries when such failures occur. Note that setting a non-zero value here can lead to duplicates in the case of network errors that cause a message to be sent but the acknowledgement to be lost.

retry.backoff.ms 100
-Before each retry, the producer refreshes the metadata of relevant topics. Since leader election takes a bit of time, this property specifies the amount of time that the producer waits before refreshing the metadata.
+Before each retry, the producer refreshes the metadata of relevant topics to see if a new leader has been elected. Since leader election takes a bit of time, this property specifies the amount of time that the producer waits before refreshing the metadata.

@@ -481,24 +496,24 @@
queue.buffering.max.ms 5000
-Maximum time, in milliseconds, for buffering data on the producer queue
+Maximum time to buffer data when using async mode. For example a setting of 100 will try to batch together 100ms of messages to send at once. This will improve throughput but adds message delivery latency due to the buffering.

queue.buffering.max.messages 10000
-The maximum size of the blocking queue for buffering on the producer
+The maximum number of unsent messages that can be queued up in the producer when using async mode before either the producer must be blocked or data must be dropped.

queue.enqueue.timeout.ms -1
-Timeout for event enqueue:
-* 0: events will be enqueued immediately or dropped if the queue is full
-* -ve: enqueue will block indefinitely if the queue is full
-* +ve: enqueue will block up to this many milliseconds if the queue is full
+The amount of time to block before dropping messages when running in async mode and the buffer has reached queue.buffering.max.messages. If set to 0 events will be enqueued immediately or dropped if the queue is full (the producer send call will never block). If set to -1 the producer will block indefinitely and never willingly drop a send.

batch.num.messages 200
-The number of messages batched at the producer
+The number of messages to send in one batch when using async mode. The producer will wait until either this number of messages are ready to send or queue.buffering.max.ms is reached.

send.buffer.bytes
@@ -508,12 +523,7 @@
client.id ""
-The client application sending the producer requests
+The client id is a user-specified string sent in each request to help trace calls. It should logically identify the application making the request.

-request.timeout.ms 1500
-The ack timeout of the producer requests. Value must be non-negative and non-zero

    More details about producer configuration can be found in the scala class kafka.producer.ProducerConfig.
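
The per-topic override format quoted in the diff ("xyz.per.topic=topic1:value1,topic2:value2") can be illustrated with a small parser. This is only a sketch in Python and is not Kafka's own parsing code; the function names parse_topic_overrides and effective_value are hypothetical, and values are kept as plain strings.

```python
def parse_topic_overrides(csv_value):
    """Parse a "topic1:value1,topic2:value2" string into a dict.

    Hypothetical helper for illustration; not part of Kafka.
    An empty string (the documented default "") yields no overrides.
    """
    overrides = {}
    for entry in csv_value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # skip empty pieces, e.g. from "" or a trailing comma
        topic, sep, value = entry.partition(":")
        if not sep or not topic.strip():
            raise ValueError("expected topic:value, got %r" % entry)
        overrides[topic.strip()] = value.strip()
    return overrides


def effective_value(topic, global_default, overrides):
    """Return the per-topic override if present, else the global default."""
    return overrides.get(topic, global_default)


if __name__ == "__main__":
    overrides = parse_topic_overrides("topic1:3000,topic2:6000")
    print(effective_value("topic1", "1000", overrides))  # overridden topic
    print(effective_value("topic3", "1000", overrides))  # falls back to default
```

Topics that do not appear in the override string simply fall back to the global setting, which is the behavior the documentation text describes.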

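
The note that a segment is deleted "when either limit is exceeded" (log.retention.hours and log.retention.bytes) can be made concrete with a toy model. The sketch below is an assumption-laden illustration, not the broker's actual cleaner: the Segment type and segments_to_delete function are hypothetical, segments are assumed to be listed oldest first, and a negative byte limit (the documented default is -1) is assumed to mean "no size limit".

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    size_bytes: int
    age_hours: float


def segments_to_delete(segments, retention_hours, retention_bytes):
    """Return names of segments to delete, scanning oldest first.

    A segment goes if it is older than retention_hours OR if the
    partition still holds more than retention_bytes in total.
    Assumption: retention_bytes < 0 disables the size limit.
    """
    doomed = []
    total = sum(s.size_bytes for s in segments)
    for seg in segments:
        too_old = seg.age_hours > retention_hours
        too_big = retention_bytes >= 0 and total > retention_bytes
        if not (too_old or too_big):
            break  # remaining segments are newer and the log is small enough
        doomed.append(seg.name)
        total -= seg.size_bytes
    return doomed


if __name__ == "__main__":
    log = [Segment("a", 100, 200.0), Segment("b", 100, 100.0), Segment("c", 100, 1.0)]
    print(segments_to_delete(log, retention_hours=168, retention_bytes=-1))
    print(segments_to_delete(log, retention_hours=1000, retention_bytes=150))
```

Because log.retention.bytes applies per partition, the total retained for a topic in this model is the per-partition limit multiplied by the number of partitions.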
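
The three cases of queue.enqueue.timeout.ms described in the diff (0, negative, positive) map naturally onto a bounded blocking queue. The following is a minimal sketch using Python's standard queue module rather than the producer's real implementation; the enqueue function is hypothetical and returns False where the producer would drop the message.

```python
import queue


def enqueue(q, message, enqueue_timeout_ms):
    """Try to put a message on a bounded queue.

    0        -> enqueue immediately or drop if the queue is full
    negative -> block indefinitely until there is room
    positive -> block up to that many milliseconds, then drop
    Returns True if the message was queued, False if it was dropped.
    """
    try:
        if enqueue_timeout_ms == 0:
            q.put_nowait(message)
        elif enqueue_timeout_ms < 0:
            q.put(message)
        else:
            q.put(message, timeout=enqueue_timeout_ms / 1000.0)
        return True
    except queue.Full:
        return False


if __name__ == "__main__":
    # maxsize plays the role of queue.buffering.max.messages in this sketch
    buf = queue.Queue(maxsize=1)
    print(enqueue(buf, "first", 0))    # room available, so it is queued
    print(enqueue(buf, "second", 0))   # queue full, so it is dropped
    print(enqueue(buf, "third", 10))   # waits 10 ms, then is dropped
```

With a negative timeout the call never drops data but can stall the sending thread for as long as the queue stays full, which is the trade-off the documentation text points at.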