Date: Fri, 1 Mar 2013 03:53:12 +0000 (UTC)
From: "Swapnil Ghike (JIRA)"
To: dev@kafka.apache.org
Subject: [jira] [Commented] (KAFKA-513) Add state change log to Kafka brokers

    [ https://issues.apache.org/jira/browse/KAFKA-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590216#comment-13590216 ]

Swapnil Ghike commented on KAFKA-513:
-------------------------------------

There are many changes, and the non-tool part of the patch is not complete yet. I will first comment on the tool that I have written and give some examples below. Let me know if you want to make any changes to the tool's input/output:

8.1 Yes.

8.2 Using dash now.

8.3 If we provide a directory as input, then the question would be whether
  1. the user needs to create this directory
  2. the tool looks for certain file names
I think we could avoid these issues by providing two options:
  1. a comma-separated list of log files
  2. a log file name regex (the user would be able to specify the full path as part of the file name regex).
Examples below.

8.4 Ok. The merge tool will look in a time interval closed at both the start time and the end time specified.

8.5 Cool. The merge tool accepts at most one topic in its arguments.
  i. If only the topic is specified, the tool will look for log entries of all its partitions.
  ii. The tool will also accept a comma-separated list of partitions. When this list is specified, the topic must be specified too, and the tool will look only for entries that match the specified topic and partitions.
  iii. If no topic is specified in the arguments, the merge tool will merge log entries of all topics.

8.6
  i. Implemented n-way merge with an output buffer of size 1MB.
  ii. The tool merges logs while maintaining chronological order. So if the tool is used to merge logs of multiple partitions or multiple topics, the output will be ordered only with respect to time, and entries for different partitions or topics will be interleaved. As discussed offline, in this case the user is expected to perform additional screening of the output.
  iii. Of course, if the user specifies only one topic and one partition, the output will not appear jumbled.
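
To illustrate 8.4 to 8.6, here is a minimal Python sketch of a timestamp-ordered n-way merge with a closed time interval and an optional topic/partition filter. It is not the code from the patch (the actual tool is kafka.tools.stateChangeLogMerger); the names merge_logs, entries and matches are hypothetical, each input is assumed to be already sorted by time, lines that do not start with a timestamp are dropped, and the 1MB output buffer is left out.

```python
import heapq
import re
from datetime import datetime

# Timestamp layout taken from the log lines in this thread, e.g. [2013-02-28 13:41:35,891]
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\]")
# Matches "[topic, 0]" and "[topic,0]"; the timestamp bracket contains a space, so it cannot match.
TP_RE = re.compile(r"\[([^,\s\[\]]+),\s*(\d+)\]")


def parse_ts(text):
    """Parse a log timestamp such as '2013-02-28 13:41:35,891'."""
    return datetime.strptime(text, TS_FORMAT)


def entries(lines):
    """Yield (timestamp, line) for every line that starts with a timestamp."""
    for line in lines:
        m = TS_RE.match(line)
        if m:
            yield parse_ts(m.group(1)), line.rstrip("\n")


def matches(line, topic, partitions):
    """True if the line mentions the topic (and one of the partitions, if given)."""
    if topic is None:
        return True  # no topic given: keep entries of all topics
    for name, part in TP_RE.findall(line):
        if name == topic and (not partitions or int(part) in partitions):
            return True
    return False


def merge_logs(sources, start=None, end=None, topic=None, partitions=None):
    """Merge already-sorted log sources into one chronologically ordered list."""
    if partitions and topic is None:
        raise ValueError("a topic is required when partitions are given")
    # heapq.merge performs the n-way merge lazily, one entry per source at a time.
    merged = heapq.merge(*(entries(src) for src in sources), key=lambda e: e[0])
    out = []
    for ts, line in merged:
        if start is not None and ts < start:
            continue
        if end is not None and ts > end:  # interval is closed at both ends
            continue
        if matches(line, topic, partitions):
            out.append(line)
    return out
```

Each source can be any iterable of lines, for example an open file object per state change log. Entries with equal timestamps keep the order of the sources they came from.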

Examples:

sghike@sghike-ld:~/kafka/kafka/bin$ ./kafka-run-class.sh kafka.tools.stateChangeLogMerger --logs-regex ../state-change.log* --topic testfoo --partitions 0 --start-time "2013-02-28 13:41:35,891" --end-time "2013-02-28 13:41:35,959"
[2013-02-28 13:41:35,891] INFO [Partition state machine on Controller 0]: Elected leader 0 for Offline partition [testfoo, 0] (partitionStateChangeLogger)
[2013-02-28 13:41:35,892] INFO [Partition state machine on Controller 0]: Partition [testfoo, 0] state changed from OnlinePartition to OnlinePartition with leader 0 (partitionStateChangeLogger)
[2013-02-28 13:41:35,900] DEBUG Controller 0, epoch 9 sending LeaderAndIsr request with correlationId 1to broker 0 for partition [testfoo,0] (partitionStateChangeLogger)
[2013-02-28 13:41:35,923] INFO [Replica state machine on Controller 0]: Replica 0 for partition [testfoo, 0] state changed to OnlineReplica (partitionStateChangeLogger)
[2013-02-28 13:41:35,924] DEBUG Controller 0, epoch 9 sending LeaderAndIsr request with correlationId 2to broker 0 for partition [testfoo,0] (partitionStateChangeLogger)
[2013-02-28 13:41:35,958] INFO [Replica Manager on Broker 0]: Received LeaderAndIsr request from controller 0, epoch 2, starting leader state transition for partition [testfoo, 0] (partitionStateChangeLogger)
[2013-02-28 13:41:35,959] INFO [Replica Manager on Broker 0]: Completed leader state transition for partition [testfoo, 0] (partitionStateChangeLogger)

sghike@sghike-ld:~/kafka/kafka/bin$ ./kafka-run-class.sh kafka.tools.stateChangeLogMerger
Provide at least one of the two arguments "[logs]" or "[logs-regex]"

sghike@sghike-ld:~/kafka/kafka/bin$ ./kafka-run-class.sh kafka.tools.stateChangeLogMerger --logs ../state-change.log --partitions 0,1
"[topic]" required with partition ids

> Add state change log to Kafka brokers
> -------------------------------------
>
>                 Key: KAFKA-513
>                 URL: https://issues.apache.org/jira/browse/KAFKA-513
>             Project: Kafka
>          Issue Type: Sub-task
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Swapnil Ghike
>            Priority: Blocker
>              Labels: p1, replication, tools
>             Fix For: 0.8
>
>         Attachments: kafka-513-v1.patch, kafka-513-v2.patch, kafka-513-v3.patch
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Once KAFKA-499 is checked in, every controller to broker communication can be modelled as a state change for one or more partitions. Every state change request will carry the controller epoch. If there is a problem with the state of some partitions, it will be good to have a tool that can create a timeline of requested and completed state changes. This will require each broker to output a state change log that has entries like
> [2012-09-10 10:06:17,280] broker 1 received request LeaderAndIsr() for partition [foo, 0] from controller 2, epoch 1
> [2012-09-10 10:06:17,350] broker 1 completed request LeaderAndIsr() for partition [foo, 0] from controller 2, epoch 1
> On controller, this will look like -
> [2012-09-10 10:06:17,198] controller 2, epoch 1, initiated state change request LeaderAndIsr() for partition [foo, 0]
> We need a tool that can collect the state change log from all brokers and create a per-partition timeline of state changes -
> [foo, 0]
> [2012-09-10 10:06:17,198] controller 2, epoch 1 initiated state change request LeaderAndIsr()
> [2012-09-10 10:06:17,280] broker 1 received request LeaderAndIsr() from controller 2, epoch 1
> [2012-09-10 10:06:17,350] broker 1 completed request LeaderAndIsr() from controller 2, epoch 1
> This JIRA involves adding the state change log to each broker and adding the tool to create the timeline