kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ismael Juma <ism...@juma.me.uk>
Subject Re: [DISCUSS] KIP-113: Support replicas movement between log directories
Date Mon, 07 Aug 2017 12:28:48 GMT
Hi Dong,

Thanks for the explanation. Comments inline.

On Fri, Aug 4, 2017 at 6:47 PM, Dong Lin <lindong28@gmail.com> wrote:

> 1. Yes it has been considered. Here are the reasons why we don't do it
> through controller.
> - There can be use-cases where we only want to rebalance the load of log
> directories on a given broker. It seems unnecessary to go through
> controller in this case.

Even though this is true, not sure how common it will be.

 - If controller is responsible for sending ChangeReplicaDirRequest, and if
> the user-specified log directory is either invalid or offline, then
> controller probably needs a way to tell user that the partition
> reassignment has failed. We currently don't have a way to do this since
> kafka-reassign-partition.sh simply creates the reassignment znode without
> waiting for response. I am not sure that is a good solution to this.

Since the JSON is provided by the user, we would ideally validate its
contents before storing it. Why are the log directories different than the
other information in the JSON?

- If controller is responsible for sending ChangeReplicaDirRequest, the
> controller logic would be more complicated because controller needs to
> first send ChangeReplicaRequest so that the broker memorize the partition
> -> log directory mapping, send LeaderAndIsrRequest, and keep sending
> ChangeReplicaDirRequest (just in case broker restarted) until replica is
> created. Note that the last step needs repeat and timeout as the proposed
> in the KIP-113.
> Overall I think this adds quite a bit complexity to controller and we
> probably want to do this only if there is strong clear of doing so.
> Currently in KIP-113 the kafka-reassign-partitions.sh is responsible for
> sending ChangeReplicaDirRequest with repeat and provides error to user if
> it either fails or timeout. It seems to be much simpler and user shouldn't
> care whether it is done through controller.

If I understand correctly, the logic is the same in both cases, it's just a
question of where it lives. The advantage of placing it in the Controller
is that the whole reassignment logic is in one place (instead of split
between the tool and the Controller). That seems easier to reason about.

Also, how do you think things would work in the context of KIP-179? Would
the tool still invoke these requests or would it be done by the broker
receiving the alterTopics/reassignPartitions protocol call?

And thanks for the suggestion. I will add this to the Rejected Alternative
> Section in the KIP-113.
> 2) I think user needs to be able to specify different log directories for
> the replicas of the same partition in order to rebalance load across log
> directories of all brokers. I am not sure I understand the question. Can
> you explain a bit more why "that the log directory has to be the same for
> all replicas of a given partition"?

I think I misunderstood the schema. The KIP has the following example:

"partitions" : [
      "topic" : str,
      "partition" : int,
      "replicas" : [int],
      "log_dirs" : [str]    <-- NEW. A log directory can be either "any",
or a valid absolute path that begins with '/'. This is an optional filed.
It is treated as an array of "any" if this field is not explicitly
specified in the json file.

Is it right that `log_dirs` is an array in the same order as `replicas`?
That's a bit obscure and we should document it more clearly. Did we discard
the option of a more readable schema (i.e. a JSON object mapping a replica
id to a log dir) due to efficiency concerns? It would be good to include
that in the KIP.


  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message