kafka-dev mailing list archives

From Gwen Shapira <gshap...@cloudera.com>
Subject Re: [SECURITY DISCUSSION] Refactoring Brokers to support multiple ports
Date Tue, 02 Dec 2014 22:56:12 GMT
Thank you so much for your help here, Jun!
Highlighting the specific protocols is very useful.

See some detailed comments below.

On Tue, Dec 2, 2014 at 1:58 PM, Jun Rao <junrao@gmail.com> wrote:
> Hi, Gwen,
> Thanks for writing up the wiki. Some comments below.
> 1. To make it more general, should we support a binding and an advertised
> host for each protocol (e.g. plaintext, ssl, etc)? We will also need to
> figure out how to specify the wildcard binding host.

Yes, that's the idea. Two lines of config: one with a list of listeners
(protocol://host:port) and one with a list of advertised listeners.
Advertised listeners are optional. I think wildcard binding is
normally done with host 0.0.0.0 (at least for HDFS), so I was planning
to keep that convention.
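
A minimal sketch of how the two config lines described above might be parsed. The helper names, the example values, and the use of 0.0.0.0 as the wildcard host are assumptions for illustration, not something taken from the wiki or the patch:

```python
from typing import Dict, NamedTuple, Optional


class Endpoint(NamedTuple):
    protocol: str
    host: str
    port: int


def parse_listeners(value: str) -> Dict[str, Endpoint]:
    """Parse 'PROTO://host:port,PROTO://host:port' into one endpoint per protocol."""
    endpoints: Dict[str, Endpoint] = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        protocol, sep, rest = item.partition("://")
        host, colon, port = rest.rpartition(":")
        if not sep or not colon or not port.isdigit():
            raise ValueError("expected protocol://host:port, got %r" % item)
        protocol = protocol.upper()
        if protocol in endpoints:
            raise ValueError("duplicate listener for protocol %s" % protocol)
        endpoints[protocol] = Endpoint(protocol, host, int(port))
    return endpoints


def advertised_or_default(
    listeners: Dict[str, Endpoint], advertised: Optional[str]
) -> Dict[str, Endpoint]:
    """Advertised listeners are optional; fall back to the bound listeners."""
    return parse_listeners(advertised) if advertised else listeners


# Hypothetical config values: bind on the wildcard host, advertise a real name.
bound = parse_listeners("PLAINTEXT://0.0.0.0:9092, SSL://0.0.0.0:9093")
public = advertised_or_default(bound, "PLAINTEXT://broker1:9092,SSL://broker1:9093")
```

Keying the result by protocol reflects the "one binding and one advertised host per protocol" idea from Jun's comment; whether duplicates should be rejected is a design choice left open here.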

> 2. Broker format change in ZK
> The broker registration in ZK needs to store the host/port for all
> protocols. We will need to bump up the version of the broker registration
> data. Since this is an intra-cluster protocol change, we need an extra
> config for rolling upgrades. So, in the first step, each broker is upgraded
> and is ready to parse brokers registered in the new format, but not
> registering using the new format yet. In the second step, when that new
> config is enabled, the broker will register using the new format.

I'm not sure this is necessary in this case. We'll bump the version for sure.
And as long as the new format contains all the fields of the previous
formats, the JSON de-serialization should work and just ignore the new fields.
So the new brokers can register with the new format right away and the
old brokers will be able to read that registration with no issues.
New brokers will be able to use old registration but will also know
about the extra ports and protocols from the additional field.
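
To make the compatibility argument concrete, here is a rough sketch. The field names (including "endpoints") and the version numbers are hypothetical, and the argument only holds if the old reader neither rejects unknown fields nor rejects an unknown version number:

```python
import json

# Hypothetical registration payloads; field names are illustrative only.
OLD_REGISTRATION = json.dumps(
    {"version": 1, "host": "broker1", "port": 9092, "jmx_port": -1}
)
NEW_REGISTRATION = json.dumps(
    {
        "version": 2,
        "host": "broker1",
        "port": 9092,
        "jmx_port": -1,
        "endpoints": ["PLAINTEXT://broker1:9092", "SSL://broker1:9093"],
    }
)


def read_as_old_broker(payload):
    """Old reader: takes the fields it knows about and ignores everything else."""
    data = json.loads(payload)
    return data["host"], data["port"]


def read_as_new_broker(payload):
    """New reader: uses the extra field if present, else derives it from host/port."""
    data = json.loads(payload)
    endpoints = data.get("endpoints")
    if endpoints is None:
        endpoints = ["PLAINTEXT://%s:%d" % (data["host"], data["port"])]
    return endpoints
```

Both readers accept both payloads, which is the property that would let new brokers register in the new format right away.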

> 3. Wire protocol changes. Currently, the broker info is used in the
> following requests/responses: TopicMetadataResponse,
> ConsumerMetadataResponse, LeaderAndIsrRequest and UpdateMetadataRequest.

> 3.1 TopicMetadataResponse and ConsumerMetadataResponse:
> These two are used between the clients and the broker. I am not sure that
> we need to make a wire protocol change for them. Currently, the protocol
> includes a single host/port pair in those responses. Based on the type of
> the port on which the request is sent, it seems that we can just pick the
> corresponding host and port to include in the response.

The wire protocol will not change here, but the Scala API (i.e., method
signatures for response and request classes) will change from getting
brokers (which no longer represent single host+port pair) to getting
endpoints (which do).
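
A small sketch of the broker/endpoint split, written in Python rather than Scala for brevity. The class and function names are made up for illustration and are not the actual API:

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class EndPoint:
    host: str
    port: int


@dataclass
class Broker:
    id: int
    endpoints: Dict[str, EndPoint]  # protocol name -> endpoint

    def endpoint_for(self, protocol: str) -> EndPoint:
        """Return the single host/port pair this broker exposes for a protocol."""
        if protocol not in self.endpoints:
            raise ValueError("broker %d has no %s listener" % (self.id, protocol))
        return self.endpoints[protocol]


def metadata_brokers(brokers: List[Broker], request_protocol: str) -> List[Tuple[int, str, int]]:
    """Build the (id, host, port) triples for a metadata response.

    The response still carries one host/port per broker; which pair is chosen
    depends on the protocol of the port the request arrived on.
    """
    result = []
    for broker in brokers:
        ep = broker.endpoint_for(request_protocol)
        result.append((broker.id, ep.host, ep.port))
    return result


cluster = [
    Broker(1, {"PLAINTEXT": EndPoint("b1", 9092), "SSL": EndPoint("b1", 9093)}),
    Broker(2, {"PLAINTEXT": EndPoint("b2", 9092), "SSL": EndPoint("b2", 9093)}),
]
```

A client that connected over SSL would get only the SSL host/port pairs back, so the bytes on the wire keep their current shape.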

I assumed the Scala API is public, but perhaps I was wrong there.

> 3.2 UpdateMetadataRequest:
> This is used between the controller and the broker. Since each broker needs
> to cache the host/port of all protocols, we need to make a wire protocol
> change. We also need to change the broker format in MetadataCache
> accordingly. This is also an intra-cluster protocol change. So the upgrade
> path will need to follow that in item 2.

Yes. Because the wire protocol is byte-array based, the existing
brokers will not be able to parse messages from new brokers without
the upgrade path you described.
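
To illustrate why a positional byte layout needs the two-step upgrade, here is a toy encoding. The layouts and the use_new_format flag are invented for this sketch and do not match the real UpdateMetadataRequest format:

```python
import struct


def _put_str(s):
    raw = s.encode("utf-8")
    return struct.pack(">h", len(raw)) + raw


def _get_str(buf, pos):
    (n,) = struct.unpack_from(">h", buf, pos)
    pos += 2
    return buf[pos:pos + n].decode("utf-8"), pos + n


def encode_v0(broker_id, host, port):
    # Toy old layout: int32 id, string host, int32 port.
    return struct.pack(">i", broker_id) + _put_str(host) + struct.pack(">i", port)


def decode_v0(buf):
    (broker_id,) = struct.unpack_from(">i", buf, 0)
    host, pos = _get_str(buf, 4)
    (port,) = struct.unpack_from(">i", buf, pos)
    return broker_id, host, port


def encode_v1(broker_id, endpoints):
    # Toy new layout: int32 id, int16 count, then (protocol, host, int32 port) each.
    out = struct.pack(">ih", broker_id, len(endpoints))
    for proto, (host, port) in sorted(endpoints.items()):
        out += _put_str(proto) + _put_str(host) + struct.pack(">i", port)
    return out


def decode_v1(buf):
    broker_id, count = struct.unpack_from(">ih", buf, 0)
    pos, endpoints = 6, {}
    for _ in range(count):
        proto, pos = _get_str(buf, pos)
        host, pos = _get_str(buf, pos)
        (port,) = struct.unpack_from(">i", buf, pos)
        pos += 4
        endpoints[proto] = (host, port)
    return broker_id, endpoints


def encode_for_peer(broker_id, endpoints, use_new_format):
    """Phase 1: flag off, keep sending v0. Phase 2: flag on, send v1."""
    if use_new_format:
        return 1, encode_v1(broker_id, endpoints)
    host, port = endpoints["PLAINTEXT"]
    return 0, encode_v0(broker_id, host, port)
```

Feeding v1 bytes to decode_v0 does not fail cleanly in this toy; it silently reads the endpoint count as a string length and returns garbage, which is why every broker has to understand the new layout before any broker starts sending it.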

Is this something that was done in the past? I'm wondering about a few things:
* Is the move from phase 1 (read new protocol but send old protocol)
to phase 2 (send and receive new protocol) done via a config
parameter? Or are there other methods?
* When do we actually bump the version? I'm planning to bump it now
and hopefully this will be the last bump for 0.9. However, if
additional patches make more protocol modifications, do we assume a
single version per release? Or do we assume people will want to
upgrade between random trunk states?

> 3.3 LeaderAndIsrRequest:
> This is also used between the controller and the broker. The receiving
> broker uses the host/port of the leader replica to send the fetch request.
> I am not sure if we need a wire protocol change in this case. I was
> imagining that we will just add a new broker config, sth like
> replication.socket.protocol. Based on this config, the controller will pick
> the right host/port to include in the request.

I agree here. It looks feasible.

> 4. Should we plan to support security just on the new java clients?
> Supporting security in both the old and the new clients adds more work and
> gives people less incentive to migrate off the old clients.

I'd love to do that. Let's plan on just the new clients.
I'll update if any of my customers yell loudly and we need to
support old clients too :)

> Thanks,
> Jun
> On Tue, Nov 25, 2014 at 11:13 AM, Gwen Shapira <gshapira@cloudera.com>
> wrote:
>> Hi Everyone,
>> One of the pre-requisites we have for supporting multiple security
>> protocols (SSL, Kerberos) is to support them on separate ports.
>> This is done in KAFKA-1684 (The SSL Patch), but that patch addresses
>> several different issues - Multiple ports, enriching the channels, SSL
>> implementation - which makes it more challenging to review and to test.
>> I'd like to split this into 3 separate patches: multi-port brokers,
>> enriching SocketChannel, and the actual security implementations.
>> Since even just adding support for multiple listeners per broker is
>> somewhat involved and touches multiple components, I wrote a short design
>> document that covers the necessary changes and the upgrade process:
>> https://cwiki.apache.org/confluence/display/KAFKA/Multiple+Listeners+for+Kafka+Brokers
>> Comments are more than welcome :)
>> If this is acceptable, I hope to have a patch ready in a few days.
>> Gwen Shapira
