kafka-dev mailing list archives

From "Swapnil Ghike (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (KAFKA-813) Minor cleanup in Controller
Date Wed, 20 Mar 2013 01:41:15 GMT

     [ https://issues.apache.org/jira/browse/KAFKA-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Swapnil Ghike updated KAFKA-813:

    Attachment: kafka-813-v2.patch

Addressing comments: 
1. Renamed the exception to PartitionNoReplicaOnlineException, since 
i. the old name caused confusion with OfflinePartition state of the partition state machine

ii. this exception is thrown when all the replicas assigned to a topic-partition are down,
or there are no replicas assigned to a topic-partition

1.1. Passing the controllerId and clientId separately to ControllerBrokerRequestBatch
1.2. Renamed KafkaController.toString to clientId

2. Created a new NoOpLeaderSelector, thanks for the great suggestion, and set it as the default.

i. It returns the existing leader, ISR, and replica assignment, so the logs will first contain
a warning issued by selectLeader(), and then the ZooKeeper write will log an error.
ii. I think it is OK to set the no-op leader selector as the default, because we don't need
to pass any leader selector when the target state is not OnlinePartition, and there could
be little confusion about why no leader selector was passed. Earlier we had cases
where the target state was OnlinePartition, but it was not clear which leader selector was
used, since we provided OfflinePartitionLeaderSelector as the default to handleStateChanges().
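
To illustrate the behaviour described in point 2, here is a minimal sketch of a no-op leader selector. It is not the code from the patch: the simplified TopicAndPartition and LeaderAndIsr case classes, the trait signature, and the injected warn callback are all assumptions made to keep the example self-contained.

```scala
// Simplified stand-ins for the controller's types; the real classes carry more fields.
final case class TopicAndPartition(topic: String, partition: Int)
final case class LeaderAndIsr(leader: Int, isr: List[Int])

trait PartitionLeaderSelector {
  // Returns the leader/ISR to write plus the replicas that should be told about it.
  def selectLeader(tp: TopicAndPartition, current: LeaderAndIsr): (LeaderAndIsr, Seq[Int])
}

// Assumed behaviour: log a warning, then hand back the existing state unchanged.
class NoOpLeaderSelector(assignment: Map[TopicAndPartition, Seq[Int]],
                         warn: String => Unit) extends PartitionLeaderSelector {
  def selectLeader(tp: TopicAndPartition, current: LeaderAndIsr): (LeaderAndIsr, Seq[Int]) = {
    warn(s"NoOpLeaderSelector asked to select a leader for $tp; returning current state")
    (current, assignment.getOrElse(tp, Seq.empty))
  }
}

object NoOpSelectorDemo {
  val tp = TopicAndPartition("events", 0)
  // Collect warnings in memory instead of writing to a real logger.
  val warnings = scala.collection.mutable.ListBuffer.empty[String]
  val selector = new NoOpLeaderSelector(Map(tp -> Seq(1, 2, 3)), msg => { warnings += msg; () })
  val (leaderAndIsr, replicas) = selector.selectLeader(tp, LeaderAndIsr(1, List(1, 2)))
}
```

In this sketch the selector never changes leadership. A caller that reaches it with OnlinePartition as the target state gets the old leader and ISR back, along with a warning that makes the missing selector visible in the logs.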

Additional changes:
1. Removed OfflinePartitionRate as discussed offline. Instead, created a new gauge, OfflinePartitionsCount,
which reports the number of partitions whose individual "leaders" are down. The reason
for this change is that OfflinePartitionRate would show how many
partitions go Offline in a given time window, but not how many
partitions fail to come back to the Online state and have stayed Offline since the last time window.
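
The difference between the two metrics can be sketched as follows. This is an illustration only, assuming the gauge is derived from the current leader map and the set of live brokers; the function name, the string partition keys, and the broker ids are hypothetical, and registration with the metrics library is not shown.

```scala
object OfflineMetricsSketch {
  // Gauge-style value: recomputed from current state on every read, so it
  // keeps reporting partitions that went offline earlier and never recovered.
  def offlinePartitionsCount(leaders: Map[String, Int], liveBrokers: Set[Int]): Int =
    leaders.count { case (_, leader) => !liveBrokers.contains(leader) }

  // Hypothetical cluster: partition name -> leader broker id.
  val leaders = Map("events-0" -> 1, "events-1" -> 2, "events-2" -> 2)

  val beforeFailure = offlinePartitionsCount(leaders, Set(1, 2)) // all leaders alive
  val afterFailure  = offlinePartitionsCount(leaders, Set(1))    // broker 2 is down
  val afterRecovery = offlinePartitionsCount(leaders, Set(1, 2)) // broker 2 is back
}
```

A rate-style meter would only mark the two offline transitions when broker 2 fails and would fall back toward zero in later windows even if the partitions stayed offline. The state-derived count stays at 2 until the leaders are actually back.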

Marking the jira as blocker because of the metric change.
> Minor cleanup in Controller
> ---------------------------
>                 Key: KAFKA-813
>                 URL: https://issues.apache.org/jira/browse/KAFKA-813
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Swapnil Ghike
>            Assignee: Swapnil Ghike
>             Fix For: 0.8
>         Attachments: kafka-813-v1.patch, kafka-813-v2.patch
> Before starting work on delete topic support, uploading a patch first to address some
> minor hiccups that touch a bunch of files:
> 1. Change PartitionOfflineException to PartitionUnavailableException because in the partition
> state machine we mark a partition offline when its leader is down, whereas the PartitionOfflineException
> is thrown when all the assigned replicas of the partition are down.
> 2. Change PartitionOfflineRate to UnavailablePartitionRate
> 3. Remove default leader selector from partition state machine's handleStateChange. We
> can specify null as default when we don't need to use a leader selector.
> 4. Include controller info in the client id of LeaderAndIsrRequest.
> 5. Rename controllerContext.allleaders to something more meaningful - partitionLeadershipInfo.
> 6. We don't need to put partition in OnlinePartition state in partition state machine
> initializeLeaderAndIsrForPartition, the state change occurs in handleStateChange.
> 7. Add todo in handleStateChanges
> 8. Left a comment above ReassignedPartitionLeaderSelector that reassigned replicas are
> already in the ISR (this is not true for other leader selectors), renamed the vals in the
