kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gwen Shapira <gshap...@cloudera.com>
Subject Re: [SECURITY DISCUSSION] Refactoring Brokers to support multiple ports
Date Fri, 05 Dec 2014 20:48:20 GMT
Hi,

I posted first draft of the change here:
https://issues.apache.org/jira/browse/KAFKA-1809

Its missing a lot of the feedback from Jun (IPv6, upgrade path, could
use more tests probably).
However, it has most of the structure that I had in mind.

So, comments are more than welcome :)

Gwen


On Tue, Dec 2, 2014 at 4:10 PM, Jun Rao <junrao@gmail.com> wrote:
> 1. What would the format of advertised listener looks like? If we have two
> hosts separated by a colon, it may make parsing IP v6 harder.
>
> 3.1 Currently, the only public api that exposes requests/responses is the
> SimpleConsumer. Since most people probably use the high level consumer,
> breaking the api in SimpleConsumer may be ok. Alternatively, we can keep
> Broker as it is, but add sth like BrokerProfile to represent the full
> broker.
>
> 3.2 We haven't made any intra-broker protocol changes yet. The idea is to
> have a config, sth like "use.new.wire.protocol" that defaults to false. In
> phase 1, we do a rolling upgrade of every broker to the new code. In phase
> 2, we set "use.new.wire.protocol" to true and do a rolling bounce of every
> broker again. Yes, we should bump up the protocol version now. As for
> multiple protocol changes within the same release, we can discuss that in
> the mailing list separately. One way that could work is that once a
> protocol change is stable, we can have a discussion in the mailing list and
> declare it stable. If there are new changes after this, we will bump up the
> version. This way, people deploying from trunk will know when it's safe to
> use the new feature.
>
> Thanks,
>
> Jun
>
>
> On Tue, Dec 2, 2014 at 2:56 PM, Gwen Shapira <gshapira@cloudera.com> wrote:
>
>> Thanks you so much for your help here Jun!
>> Highlighting the specific protocols is very useful.
>>
>> See some detailed comments below.
>>
>> On Tue, Dec 2, 2014 at 1:58 PM, Jun Rao <junrao@gmail.com> wrote:
>> > Hi, Gwen,
>> >
>> > Thanks for writing up the wiki. Some comments below.
>> >
>> > 1. To make it more general, should we support a binding and an advertised
>> > host for each protocol (e.g. plaintext, ssl, etc)? We will also need to
>> > figure out how to specify the wildcard binding host.
>>
>> Yes, thats the idea. Two lines of config, one with list of listeners
>> (protocol://host:port) and one with list of advertised listeners.
>> Advertised listeners are optional. I think wildcard binding is
>> normally done with 0.0.0.0 host (at least for HDFS), so I was planning
>> to keep that convention.
>>
>> >
>> > 2. Broker format change in ZK
>> > The broker registration in ZK needs to store the host/port for all
>> > protocols. We will need to bump up the version of the broker registration
>> > data. Since this is an intra-cluster protocol change, we need an extra
>> > config for rolling upgrades. So, in the first step, each broker is
>> upgraded
>> > and is ready to parse brokers registered in the new format, but not
>> > registering using the new format yet. In the second step, when that new
>> > config is enabled, the broker will register using the new format.
>> >
>>
>> I'm not sure this is necessary in this case. We'll bump the version for
>> sure.
>> And as long as the new format contains all the fields of the previous
>> formats, the JSON de-serialization should work and just ignore the new
>> fields.
>> So the new brokers can register with the new format right away and the
>> old brokers will be able to read that registration with no issues.
>> New brokers will be able to use old registration but will also know
>> about the extra ports and protocols from the additional field.
>>
>> > 3. Wire protocol changes. Currently, the broker info is used in the
>> > following requests/responses: TopicMetadataResponse ,
>> > ConsumerMetadataResponse, LeaderAndIsrRequest  and UpdateMetadataRequest.
>>
>>
>> > 3.1 TopicMetadataResponse and ConsumerMetadataResponse:
>> > These two are used between the clients and the broker. I am not sure that
>> > we need to make a wire protocol change for them. Currently, the protocol
>> > includes a single host/port pair in those responses. Based on the type of
>> > the port on which the request is sent, it seems that we can just pick the
>> > corresponding host and port to include in the response.
>>
>> The wire protocol will not change here, but the Scala API (i.e method
>> signatures for response and request classes) will change from getting
>> brokers (which no longer represent single host+port pair) to getting
>> endpoints (which do).
>>
>> I assumed the Scala API is public, but perhaps I was wrong there.
>>
>>
>> > 3.2 UpdateMetadataRequest:
>> > This is used between the controller and the broker. Since each broker
>> needs
>> > to cache the host/port of all protocols, we need to make a wire protocol
>> > change. We also need to change the broker format in MetadataCache
>> > accordingly. This is also an intra-cluster protocol change. So the
>> upgrade
>> > path will need to follow that in item 2.
>>
>> Yes. Because the wire protocol is byte-array based, the existing
>> brokers will not be able to parse messages from new brokers without
>> the upgrade path you described.
>>
>> Is this something that was done in the past? I'm wondering about few
>> things:
>> * Is the move from phase 1 (read new protocol but send old protocol)
>> to phase 2 (send and receive new protocol) done via a config
>> parameter? Or is there other methods?
>> * When do we actually bump the version? I'm planning to bump it now
>> and hopefully this will be the last bump for 0.9. However, if
>> additional patches make more protocol modifications, do we assume a
>> single version per release? Or do we assume people will want to
>> upgrade between random trunk states?
>>
>>
>> > 3.3 LeaderAndIsrRequest:
>> > This is also used between the controller and the broker. The receiving
>> > broker uses the host/port of the leader replica to send the fetch
>> request.
>> > I am not sure if we need a wire protocol change in this case. I was
>> > imagining that we will just add a new broker config, sth like
>> > replication.socket.protocol. Base on this config, the controller will
>> pick
>> > the right host/port to include in the request.
>>
>> I agree here. It looks feasible.
>>
>> >
>> > 4. Should we plan to support security just on the new java clients?
>> > Supporting security in both the old and the new clients adds more work
>> and
>> > gives people less incentive to migrate off the old clients.
>>
>> I'd love to do that. Lets plan on just new clients.
>> I'll update if any of my customers yells loudly and we'll need to
>> support old clients too :)
>>
>> > Thanks,
>> >
>> > Jun
>> >
>> >
>> > On Tue, Nov 25, 2014 at 11:13 AM, Gwen Shapira <gshapira@cloudera.com>
>> > wrote:
>> >
>> >> Hi Everyone,
>> >>
>> >> One of the pre-requisites we have for supporting multiple security
>> >> protocols (SSL, Kerberos) is to support them on separate ports.
>> >>
>> >> This is done in KAFKA-1684 (The SSL Patch), but that patch addresses
>> >> several different issues - Multiple ports, enriching the channels, SSL
>> >> implementation - which makes it more challenging to review and to test.
>> >>
>> >> I'd like to split this into 3 separate patches: multi-port brokers,
>> >> enriching SocketChannel, and  the actual security implementations.
>> >>
>> >> Since even just adding support for multiple listeners per broker is
>> >> somewhat involved and touches multiple components, I wrote a short
>> design
>> >> document that covers the necessary changes and the upgrade process:
>> >>
>> >>
>> >>
>> https://cwiki.apache.org/confluence/display/KAFKA/Multiple+Listeners+for+Kafka+Brokers
>> >>
>> >> Comments are more than welcome :)
>> >>
>> >> If this is acceptable, hope to have a patch ready in few days.
>> >>
>> >> Gwen Shapira
>> >>
>>

Mime
View raw message