Date: Tue, 29 Jan 2013 10:41:55 -0800
Subject: Re: 0.8 wire protocol for inter-broker communication
From: Jay Kreps
To: "dev@kafka.apache.org"

I don't think this is actually that hard to handle; you just need a config
to enable the new fields:

Step 1: Implement optional support for the new field, with some option that
controls whether it is used.
Step 2: Push all servers, still using the old format.
Step 3: Now enable the new field on servers one at a time.

This is a couple of steps, but since server pushes are easy that should be
fine.

If we want to make this easy for upgrades we can have an
"enable.0.8.compat.mode=true" flag which enables or disables all these
together when we do an official release and document it in the release
notes.

-Jay

On Tue, Jan 29, 2013 at 8:33 AM, Jun Rao wrote:

> Hi,
>
> In 0.8, we added versionId for each type of request. The plan is that if
> we want to evolve a particular request, we can implement the logic in the
> broker to support both the old and the new versions. Then, we can upgrade
> the server first, followed by the clients.
>
> However, this approach doesn't quite work for requests used among brokers.
> These include all requests sent by the controller (e.g.,
> LeaderAndIsrRequest) and FetchRequest (used by replica fetchers). If we
> want to evolve those requests, we will have to bring down the whole cluster
> to do the upgrade (since each broker is both a client and a server). This
> of course will make the cluster unavailable.
>
> So, we need to think about a couple of things. First, what's our strategy
> to evolve those inter-broker requests? One thing that I can think of is to
> do the upgrade in two passes. In the first pass, we upgrade all brokers
> so that each of them is capable of receiving the new version, but not
> able to send the new version (this can be controlled by a config). In the
> second pass, we upgrade all brokers again, this time allowing them to send
> the new version. Not sure if this is the best way, since this will make
> upgrades a bit more complicated.
>
> Second, we probably need to make another pass over those requests to make
> sure that they are in good shape, since any change in the future may not be
> easy. For example, in the LeaderAndIsr response, should we remove the global
> errorcode since we already have an errorcode per partition? Also, for the
> FetchRequest used by the replica fetcher, currently we assume that the fetch
> offset equals the logEndOffset of the remote replica. If we want to
> pipeline those requests, this may not be true. So, we will need a separate
> field to represent logEndOffset.
>
> Thanks,
>
> Jun
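
For readers following the thread, the two-pass upgrade discussed above can be
sketched roughly as follows. This is only an illustration under stated
assumptions, not Kafka's actual wire format or code: the byte layout, the
version numbers 0 and 1, and the names FetchRequest, encode, decode and
rolling_upgrade are all hypothetical. The point it tries to show is that once
every broker can read both versions, the "send new format" switch can be
flipped one broker at a time without any pair of brokers failing to
understand each other.

```python
import struct
from dataclasses import dataclass
from typing import Optional


@dataclass
class FetchRequest:
    fetch_offset: int
    # Hypothetical new field; absent (None) in the old format.
    log_end_offset: Optional[int] = None


def encode(req: FetchRequest, send_new_format: bool) -> bytes:
    """Serialize with an illustrative layout: int16 version, then int64 fields."""
    if send_new_format:
        leo = req.log_end_offset
        if leo is None:
            # Old behaviour: fetch offset is assumed to equal logEndOffset.
            leo = req.fetch_offset
        return struct.pack(">hqq", 1, req.fetch_offset, leo)
    return struct.pack(">hq", 0, req.fetch_offset)


def decode(data: bytes) -> FetchRequest:
    """Every upgraded broker accepts both versions (step 1 in the thread)."""
    (version,) = struct.unpack_from(">h", data, 0)
    if version == 0:
        (fetch_offset,) = struct.unpack_from(">q", data, 2)
        return FetchRequest(fetch_offset, None)
    if version == 1:
        fetch_offset, leo = struct.unpack_from(">qq", data, 2)
        return FetchRequest(fetch_offset, leo)
    raise ValueError(f"unsupported version {version}")


def rolling_upgrade(broker_ids):
    """Simulate steps 2 and 3: all brokers start on the old format, then
    the new format is enabled one broker at a time."""
    send_new = {b: False for b in broker_ids}  # step 2: old format everywhere
    for b in broker_ids:                       # step 3: flip one at a time
        send_new[b] = True
        # At every intermediate point, any sender must be readable by any peer.
        for sender in broker_ids:
            wire = encode(FetchRequest(100, 120), send_new[sender])
            assert decode(wire).fetch_offset == 100
    return send_new
```

A single umbrella flag such as the "enable.0.8.compat.mode" setting suggested
in the thread would play the role of send_new_format here, toggling all such
new fields together.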