kafka-commits mailing list archives

From jun...@apache.org
Subject svn commit: r1607405 - /kafka/site/081/design.html
Date Wed, 02 Jul 2014 15:59:50 GMT
Author: junrao
Date: Wed Jul  2 15:59:50 2014
New Revision: 1607405

URL: http://svn.apache.org/r1607405
Log:
fix broken link and typo

Modified:
    kafka/site/081/design.html

Modified: kafka/site/081/design.html
URL: http://svn.apache.org/viewvc/kafka/site/081/design.html?rev=1607405&r1=1607404&r2=1607405&view=diff
==============================================================================
--- kafka/site/081/design.html (original)
+++ kafka/site/081/design.html Wed Jul  2 15:59:50 2014
@@ -97,7 +97,7 @@ The client controls which partition it p
 <p>
 Batching is one of the big drivers of efficiency, and to enable batching the Kafka producer
has an asynchronous mode that accumulates data in memory and sends out larger batches in a
single request. The batching can be configured to accumulate no more than a fixed number of
messages and to wait no longer than some fixed latency bound (say 100 messages or 5 seconds).
This allows the accumulation of more bytes to send, and few larger I/O operations on the servers.
Since this buffering happens in the client it obviously reduces the durability as any data
buffered in memory and not yet sent will be lost in the event of a producer crash.
 <p>
-Note that as of Kafka 0.8.1 the async producer does not have a callback, which could be used
to register handlers to catch send errors.  Adding such callback functionality is proposed
for Kafka 0.9, see [Proposed Producer API](https://cwiki.apache.org/confluence/display/KAFKA/Client+Rewrite#ClientRewrite-ProposedProducerAPI).
+Note that as of Kafka 0.8.1 the async producer does not have a callback, which could be used
to register handlers to catch send errors.  Adding such callback functionality is proposed
for Kafka 0.9, see <a href="https://cwiki.apache.org/confluence/display/KAFKA/Client+Rewrite#ClientRewrite-ProposedProducerAPI">Proposed
Producer API</a>.
 
 <h3><a id="theconsumer">4.5 The Consumer</a></h3>
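The batching policy described in the hunk above (flush when either a message-count cap or a latency bound is reached, e.g. 100 messages or 5 seconds) can be sketched as a small accumulator. This is an illustrative sketch only, not Kafka's actual producer code; the class and parameter names here are hypothetical:

```python
import time

class BatchAccumulator:
    """Accumulate messages and flush as one larger send when either
    a message-count cap or a latency bound is reached.

    Illustrative sketch of the batching policy described in the text;
    not Kafka's actual async producer implementation.
    """

    def __init__(self, send, max_messages=100, max_latency_s=5.0):
        self.send = send                  # callable performing one larger I/O
        self.max_messages = max_messages  # e.g. 100 messages
        self.max_latency_s = max_latency_s  # e.g. 5 seconds
        self.buffer = []
        self.first_append = None

    def append(self, message):
        if not self.buffer:
            # Start the latency clock when the batch begins to fill.
            self.first_append = time.monotonic()
        self.buffer.append(message)
        self.maybe_flush()

    def maybe_flush(self):
        if not self.buffer:
            return
        full = len(self.buffer) >= self.max_messages
        stale = time.monotonic() - self.first_append >= self.max_latency_s
        if full or stale:
            batch, self.buffer = self.buffer, []
            self.send(batch)  # one request instead of many small ones
```

As the quoted text notes, anything still sitting in `self.buffer` when the producer process crashes is lost, which is the durability trade-off of client-side buffering.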
 
@@ -155,7 +155,7 @@ These are not the strongest possible sem
 <p>
 Not all use cases require such strong guarantees. For uses which are latency sensitive we
allow the producer to specify the durability level it desires. If the producer specifies that
it wants to wait on the message being committed this can take on the order of 10 ms. However
the producer can also specify that it wants to perform the send completely asynchronously
or that it wants to wait only until the leader (but not necessarily the followers) have the
message.
 <p>
-Now let's describe the semantics from the point-of-view of the consumer. All replicas have
the exact same log with the same offsets. The consumer controls its position in this log.
If the consumer never crashed it could just store this position in memory, but if the producer
fails and we want this topic partition to be taken over by another process the new process
will need to choose an appropriate position from which to start processing. Let's say the
consumer reads some messages -- it has several options for processing the messages and updating
its position.
+Now let's describe the semantics from the point-of-view of the consumer. All replicas have
the exact same log with the same offsets. The consumer controls its position in this log.
If the consumer never crashed it could just store this position in memory, but if the consumer
fails and we want this topic partition to be taken over by another process the new process
will need to choose an appropriate position from which to start processing. Let's say the
consumer reads some messages -- it has several options for processing the messages and updating
its position.
 <ol>
   <li>It can read the messages, then save its position in the log, and finally process
the messages. In this case there is a possibility that the consumer process crashes after
saving its position but before saving the output of its message processing. In this case the
process that took over processing would start at the saved position even though a few messages
prior to that position had not been processed. This corresponds to "at-most-once" semantics
as in the case of a consumer failure messages may not be processed.
   <li>It can read the messages, process the messages, and finally save its position.
In this case there is a possibility that the consumer process crashes after processing messages
but before saving its position. In this case when the new process takes over the first few
messages it receives will already have been processed. This corresponds to the "at-least-once"
semantics in the case of consumer failure. In many cases messages have a primary key and so
the updates are idempotent (receiving the same message twice just overwrites a record with
another copy of itself).
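The two orderings enumerated above differ only in when the consumer saves its position relative to processing. A minimal sketch (hypothetical helper names; not the Kafka consumer API) of the same two options:

```python
def consume_at_most_once(read_batch, save_position, process):
    """Save the position first, then process. A crash between the two
    steps skips the unprocessed messages: at-most-once delivery."""
    messages, position = read_batch()
    save_position(position)  # crash after this line => messages never processed
    for m in messages:
        process(m)

def consume_at_least_once(read_batch, save_position, process):
    """Process first, then save the position. A crash between the two
    steps redelivers the batch: at-least-once delivery."""
    messages, position = read_batch()
    for m in messages:
        process(m)           # crash after this line => messages reprocessed
    save_position(position)
```

As the text observes, at-least-once duplicates are often harmless in practice: if each message carries a primary key, reprocessing it just overwrites the same record.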


