kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Neha Narkhede (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-513) Add state change log to Kafka brokers
Date Tue, 05 Mar 2013 01:06:11 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592884#comment-13592884

Neha Narkhede commented on KAFKA-513:

Thanks for the patch! This patch is great and is very close to being checked in. Few review
suggestions -

1. ControllerChannelManager
1.1 The log message when the controller sends the data includes the controller's epoch, but
the response doesn't. It is useful to know the epoch wherever the controller id is logged.
1.2 The "," between controller id and epoch probably could be skipped, this pattern is not
followed elsewhere in this patch.

2. Partition
2.1 The error statement when the broker is dropping the leaderAndIsr request should be -

"since local leader epoch %d is >= the request's leader epoch"

2.2 Also, let's add "Broker %d aborted..." to the following statement similar to "Broker %d

"Aborted the become-follower state change since leader %d for partition [%s,%d]"

3. PartitionStateMachine
In electLeaderForPartition() API, it is useful to include the contents of the StateChangedFailedException
in the state change log. This is because it is useful to know that after starting the state
change, it got aborted because another controller with a higher controller epoch was detected.
So something like -
"Controller %d epoch %d aborted leader election for partition [%s,%d] since ..."

4. ReplicaManager

The log statements here are of the form of [Replica Manager on Broker %d]: Handling... This
is different from the ones in Partition which are like "Broker %d started become-follower...".
I guess the default logIdent, which is meant for the main log, is the cause of this discrepancy.
For the purpose of the state change log, we actually don't care if it is Replica Manager or
Partition on that broker, it would just suffice to have the simplest statement to communicate
the state changes like the ones in Partition except the logIdent.

5. ReplicaStateMachine
Same as 4

6. Rename stateChangeLogMerger.scala to have 1st letter in caps.

7. StateChangeLogMerger

This tool looks great now. Thanks for including the review suggestions. Few minor observations

7.1 You check if at least one of the two log input options are specified. I think we should
also check that at most one of them is specified. In other words, what's the expected behavior
if both are speci
fied ?

> Add state change log to Kafka brokers
> -------------------------------------
>                 Key: KAFKA-513
>                 URL: https://issues.apache.org/jira/browse/KAFKA-513
>             Project: Kafka
>          Issue Type: Sub-task
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Swapnil Ghike
>            Priority: Blocker
>              Labels: p1, replication, tools
>             Fix For: 0.8
>         Attachments: kafka-513-v1.patch, kafka-513-v2.patch, kafka-513-v3.patch, kafka-513-v4.patch
>   Original Estimate: 96h
>  Remaining Estimate: 96h
> Once KAFKA-499 is checked in, every controller to broker communication can be modelled
as a state change for one or more partitions. Every state change request will carry the controller
epoch. If there is a problem with the state of some partitions, it will be good to have a
tool that can create a timeline of requested and completed state changes. This will require
each broker to output a state change log that has entries like
> [2012-09-10 10:06:17,280] broker 1 received request LeaderAndIsr() for partition [foo,
0] from controller 2, epoch 1
> [2012-09-10 10:06:17,350] broker 1 completed request LeaderAndIsr() for partition [foo,
0] from controller 2, epoch 1
> On controller, this will look like -
> [2012-09-10 10:06:17,198] controller 2, epoch 1, initiated state change request LeaderAndIsr()
for partition [foo, 0]
> We need a tool that can collect the state change log from all brokers and create a per-partition
timeline of state changes -
> [foo, 0]
> [2012-09-10 10:06:17,198] controller 2, epoch 1 initiated state change request LeaderAndIsr()

> [2012-09-10 10:06:17,280] broker 1 received request LeaderAndIsr() from controller 2,
epoch 1
> [2012-09-10 10:06:17,350] broker 1 completed request LeaderAndIsr() from controller 2,
epoch 1
> This JIRA involves adding the state change log to each broker and adding the tool to
create the timeline 

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

View raw message