Mailing-List: contact dev-help@kafka.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@kafka.apache.org
Date: Thu, 4 Apr 2013 02:23:15 +0000 (UTC)
From: "Scott Carey (JIRA)" <jira@apache.org>
To: dev@kafka.apache.org
Message-ID: <JIRA.12527292.1318685327943.104080.1365042195317@arcas>
In-Reply-To: <JIRA.12527292.1318685327943@arcas>
References: <JIRA.12527292.1318685327943@arcas>
Subject: [jira] [Commented] (KAFKA-156) Messages should not be dropped when
 brokers are unavailable
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/KAFKA-156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621665#comment-13621665 ] 

Scott Carey commented on KAFKA-156:
-----------------------------------

To support lossless transmission, the producer protocol will need to change to a two-phase exchange.

Rather than sending a message batch and receiving a response with the offset of the first message (the Producer Request and Producer Response described at https://cwiki.apache.org/KAFKA/a-guide-to-the-kafka-protocol.html#AGuideToTheKafkaProtocol-ProduceAPI ), the process would have two exchanges per batch:

Option 1: Broker assigns batch ids as the batches come in.  Drawback: broker must track UUIDs and hold state on them for potentially a long time.

1.  "Batch Prepare": Producer sends batch to broker
2.  "Acknowledge Batch Prepare": Broker commits batch to staging area, and assigns a UUID (or similar unique id) to the batch, and returns the UUID to the producer
3.  "Batch Commit": Producer sends message to broker to commit the batch, with the UUID token to identify the batch
4.  "Acknowledge Batch Commit": Broker commits batch from staging area to topic atomically (or idempotently and non-atomically), and returns an acknowledgement with the offset

If the producer crashes or loses connection between steps 1 and 2 or between steps 2 and 3 (or the network breaks and is later restored), it can send the batch again, get a new UUID, and start over, orphaning the first batch.  The client needs to be able to clear out orphaned batches it created, or they must expire after a long time.
If the producer crashes or has network issues between steps 3 and 4, then upon restore it will attempt step 3 again, which is idempotent and safe.  The broker has to keep the in-flight UUIDs and used UUIDs for a while because a client may have a large time lag in recovering from a failed step 3 to step 4 exchange, and steps 3 and 4 may occur multiple times as a result.
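
As a rough illustration of Option 1, here is a minimal in-memory sketch in Python.  It is not Kafka code: the class Option1Broker and its prepare/commit methods are hypothetical names invented for this example, and a real broker would persist the staging area and expire orphaned batches.  It only shows that a retried commit is safe while a retried prepare leaves an orphan behind.

```python
import uuid


class Option1Broker:
    """In-memory stand-in for a broker that assigns batch ids (hypothetical)."""

    def __init__(self):
        self.log = []        # committed messages; list index is the offset
        self.staged = {}     # batch id -> messages awaiting commit
        self.committed = {}  # batch id -> offset of the batch's first message

    def prepare(self, messages):
        # Steps 1-2: stage the batch and hand back a fresh id.
        batch_id = str(uuid.uuid4())
        self.staged[batch_id] = list(messages)
        return batch_id

    def commit(self, batch_id):
        # Steps 3-4: idempotent; a repeated commit returns the same offset.
        if batch_id in self.committed:
            return self.committed[batch_id]
        messages = self.staged.pop(batch_id)  # KeyError if the id is unknown
        offset = len(self.log)
        self.log.extend(messages)
        self.committed[batch_id] = offset
        return offset


broker = Option1Broker()
lost_id = broker.prepare(["a", "b"])   # the ack for this prepare is lost
retry_id = broker.prepare(["a", "b"])  # producer starts over with a new id
first = broker.commit(retry_id)
again = broker.commit(retry_id)        # the commit ack was lost; retrying is safe
orphans = sorted(broker.staged)        # the first staged copy is now orphaned
```

After this run the log holds the batch exactly once, and the only entry left in the staging area is the orphan created by the lost prepare acknowledgement.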

Option 2:  Pre-assigned batch ids.  Benefit:  failure between steps 1 and 3 does not orphan a batch.

0a.  "Request Batch IDs": Producer requests a set of unique batch ids for later use
0b.  "Receive Batch IDs": Broker returns UUIDs for use as batch ids later.
1.  "Batch Prepare": Producer sends batch to broker with one of the UUIDs.
2.  "Acknowledge Batch Prepare": Broker commits batch to staging area tagged with the UUID
3.  "Batch Commit": Producer sends message to broker to commit the batch, with the UUID token to identify the batch
4.  "Acknowledge Batch Commit": Broker commits batch from staging area to topic atomically (or idempotently and non-atomically), and returns an acknowledgement with the offset

If the producer crashes or loses connection between steps 1 and 2 or between steps 2 and 3, it can attempt step 3, optimistically assuming the broker received the batch and staged it.  If the broker did not and step 3 fails, the producer can start over with step 1 using the same UUID.
If the producer crashes or loses connection between steps 3 and 4, then upon restore it will attempt step 3 again, which is idempotent and safe.  If step 3 fails, it can assume the batch has already been committed.  The broker has to track in-flight UUIDs, recently committed UUIDs, and the corresponding offsets for a while because a client may have a large time lag in recovery after a failure between steps 3 and 4, and steps 3 and 4 may occur more than once for a given batch.
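
The Option 2 recovery rule ("try step 3 first, fall back to step 1 with the same UUID") can be sketched the same way.  Again this is an illustrative Python model, not Kafka's API: Option2Broker, NotStaged, and recover are hypothetical names.  One assumption beyond the text above: a repeated commit for an already committed id returns the remembered offset rather than failing, which is one way to use the committed-UUID tracking described.

```python
import uuid


class NotStaged(Exception):
    """Raised when a commit names a batch id that has no staged batch."""


class Option2Broker:
    """In-memory stand-in for a broker with pre-assigned batch ids (hypothetical)."""

    def __init__(self):
        self.log = []        # committed messages; list index is the offset
        self.issued = set()  # ids handed out in steps 0a-0b
        self.staged = {}     # batch id -> messages awaiting commit
        self.committed = {}  # batch id -> offset of the batch's first message

    def allocate_ids(self, count):
        # Steps 0a-0b: hand out ids before any batch is sent.
        ids = [str(uuid.uuid4()) for _ in range(count)]
        self.issued.update(ids)
        return ids

    def prepare(self, batch_id, messages):
        # Steps 1-2: re-sending under the same id overwrites, never duplicates.
        if batch_id not in self.issued:
            raise KeyError(batch_id)
        if batch_id not in self.committed:
            self.staged[batch_id] = list(messages)

    def commit(self, batch_id):
        # Steps 3-4: idempotent for ids that were already committed.
        if batch_id in self.committed:
            return self.committed[batch_id]
        if batch_id not in self.staged:
            raise NotStaged(batch_id)
        messages = self.staged.pop(batch_id)
        offset = len(self.log)
        self.log.extend(messages)
        self.committed[batch_id] = offset
        return offset


def recover(broker, batch_id, messages):
    """Resume after a failure at an unknown step: try step 3, else redo step 1."""
    try:
        return broker.commit(batch_id)
    except NotStaged:
        broker.prepare(batch_id, messages)
        return broker.commit(batch_id)


broker = Option2Broker()
id_a, id_b = broker.allocate_ids(2)
offset_a = recover(broker, id_a, ["a1", "a2"])  # batch never reached the broker
broker.prepare(id_b, ["b1"])                    # staged, but the ack was lost
offset_b = recover(broker, id_b, ["b1"])
offset_b_again = recover(broker, id_b, ["b1"])  # the commit ack was lost too
```

In all three recovery cases the batch ends up in the log exactly once and nothing is left orphaned in the staging area, which is the benefit claimed for Option 2.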

If it is tolerable to lose or duplicate up to one batch per failure (network, consumer, or broker), none of the above is required, and the currently documented protocol is sufficient.  Since there is always the possibility of message loss if the producer crashes, this may be acceptable.  However, it would be nice not to worry about data loss due to broker or network failure, and to leave the only loss window on the producer side.

With two-phase batches, a producer can safely spool data to disk in the event of a serious error and recover, or spool to disk in all conditions before sending downstream to the broker.  Each batch on disk would be tagged with a UUID, and a log can be kept of the current state relative to the four steps above, so that recovery can start at the right spot and no batches are missed or duplicated.
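
One possible shape for that producer-side state log, sketched in Python.  Everything here is an assumption made for illustration: the JSON-lines journal format, the three state names, and the record/pending_batches helpers are not part of any existing producer.

```python
import json
import os
import tempfile

# Hypothetical state names, one per point where recovery could resume.
SPOOLED, PREPARED, COMMITTED = "spooled", "prepared", "committed"


def record(journal_path, batch_id, state):
    """Append one state transition and force it to disk before continuing."""
    with open(journal_path, "a", encoding="utf-8") as journal:
        journal.write(json.dumps({"id": batch_id, "state": state}) + "\n")
        journal.flush()
        os.fsync(journal.fileno())


def pending_batches(journal_path):
    """Replay the journal; return the last known state of each unfinished batch."""
    latest = {}
    with open(journal_path, encoding="utf-8") as journal:
        for line in journal:
            entry = json.loads(line)
            latest[entry["id"]] = entry["state"]
    return {bid: state for bid, state in latest.items() if state != COMMITTED}


journal_path = os.path.join(tempfile.mkdtemp(), "producer.journal")
record(journal_path, "batch-1", SPOOLED)
record(journal_path, "batch-1", PREPARED)
record(journal_path, "batch-1", COMMITTED)
record(journal_path, "batch-2", SPOOLED)
record(journal_path, "batch-2", PREPARED)  # crash before the commit ack arrived
record(journal_path, "batch-3", SPOOLED)   # crash before the prepare ack arrived
todo = pending_batches(journal_path)
```

On restart, a batch whose last state is "prepared" would resume at step 3, and one that is only "spooled" would resume at step 1 (or, under Option 2, optimistically at step 3).  The sketch does not handle a torn final line left by a crash in the middle of a write.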
                
> Messages should not be dropped when brokers are unavailable
> -----------------------------------------------------------
>
>                 Key: KAFKA-156
>                 URL: https://issues.apache.org/jira/browse/KAFKA-156
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Sharad Agarwal
>             Fix For: 0.8
>
>
> When none of the broker is available, producer should spool the messages to disk and keep retrying for brokers to come back.
> This will also enable brokers upgrade/maintenance without message loss.
