kafka-dev mailing list archives

From "Swapnil Ghike (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-559) Garbage collect old consumer metadata entries
Date Thu, 04 Jul 2013 07:28:20 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699819#comment-13699819 ]

Swapnil Ghike commented on KAFKA-559:
-------------------------------------

Some feedback:

1. Passing a groupId for cleanup will make the cleanup job tedious since we tend to have hundreds
of console-consumer group ids in ZK that are stale. Running the tool for a particular topic
or all topics probably makes more sense. 

2. I would suggest accepting a date param "mm-dd-yyyy hh:mm:ss,SSS" as a String instead of
accepting a timestamp value, and deleting the group only if it has had no updates to its offsets
since that date, as described above.

3. It's dangerous to delete the entire group if the date/"since" is not provided. It's very
easy for a user to specify only two arguments (topic and zkconnect) and forget the date.
Let's make sure that the user always specifies a date.

4. "dry-run" does not need to accept any value. You can simply use parser.accepts("dry-run",
"....") and then use if (options.has(dryRunOpt)) { yeay } else { nay }.

5. We can inline exitIfNoPathExists; the implementation is small and clear enough.

6. We should have an info statement when the group ids are deleted in non-dry-run mode.

7. info("Removal has successfully completed.") can probably be refactored to something more
specific to this tool.

8. Instead of writing a different info statement for dry-run mode, I think you should be able
to set logIdent of Logging to "[dry-run]" or "" depending on which mode the tool is working
in. This will let you have a single info statement for both modes. 
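
Items 2, 3, 4 and 8 above could be sketched roughly as follows. This is a minimal,
standard-library-only Java illustration, not code from the KAFKA-559 patch: the class
CleanupOptions, its method names, the hand-rolled argument scan (standing in for the option
parser referenced in item 4), and the choice of UTC are all assumptions. Note that in
SimpleDateFormat the month is "MM" and the 24-hour clock is "HH", so the pattern from item 2
is written here as "MM-dd-yyyy HH:mm:ss,SSS".

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Date;
import java.util.List;
import java.util.TimeZone;

// Hypothetical helper for illustration only; not part of the KAFKA-559 patch.
final class CleanupOptions {
    // "MM" = month, "HH" = 24-hour clock ("mm" would mean minutes, "hh" the 12-hour clock).
    static final String SINCE_FORMAT = "MM-dd-yyyy HH:mm:ss,SSS";

    final Date since;
    final boolean dryRun;

    private CleanupOptions(Date since, boolean dryRun) {
        this.since = since;
        this.dryRun = dryRun;
    }

    // Item 3: refuse to run when --since is absent, so a group is never deleted by accident.
    static CleanupOptions parse(String... args) {
        List<String> argList = Arrays.asList(args);
        int sinceIdx = argList.indexOf("--since");
        if (sinceIdx < 0 || sinceIdx + 1 >= argList.size()) {
            throw new IllegalArgumentException("--since \"" + SINCE_FORMAT + "\" is required");
        }

        // Item 2: accept the date as a String rather than a raw timestamp.
        SimpleDateFormat format = new SimpleDateFormat(SINCE_FORMAT);
        format.setLenient(false);                          // reject dates such as 13-45-2013
        format.setTimeZone(TimeZone.getTimeZone("UTC"));   // assumption: dates are given in UTC
        Date since;
        try {
            since = format.parse(argList.get(sinceIdx + 1));
        } catch (ParseException e) {
            throw new IllegalArgumentException("--since must match " + SINCE_FORMAT, e);
        }

        // Item 4: --dry-run is a bare flag; its presence alone switches the mode on.
        boolean dryRun = argList.contains("--dry-run");
        return new CleanupOptions(since, dryRun);
    }

    // Item 8: one prefix decides the mode, so a single log statement serves both modes.
    String logIdent() {
        return dryRun ? "[dry-run] " : "";
    }

    String deletionMessage(String groupId) {
        return logIdent() + "Deleting consumer group " + groupId;
    }
}
```

With this shape, CleanupOptions.parse("--since", "07-04-2013 07:28:20,000", "--dry-run")
yields a message prefixed with "[dry-run] ", while omitting --since fails before anything
is touched.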

Minor stuff:

1. I think we tend to use camelCase in variable names instead of underscores. 
2. Whitespace can be made more consistent.
                
> Garbage collect old consumer metadata entries
> ---------------------------------------------
>
>                 Key: KAFKA-559
>                 URL: https://issues.apache.org/jira/browse/KAFKA-559
>             Project: Kafka
>          Issue Type: New Feature
>            Reporter: Jay Kreps
>            Assignee: Tejas Patil
>              Labels: project
>         Attachments: KAFKA-559.v1.patch
>
>
> Many use cases involve transient consumers. These consumers create entries under their
> consumer group in zk and maintain offsets there as well. There is currently no way to delete
> these entries. It would be good to have a tool that did something like
>   bin/delete-obsolete-consumer-groups.sh [--topic t1] --since [date] --zookeeper [zk_connect]
> This would scan through consumer group entries and delete any that had no offset update
> since the given date.
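
The scan described in the issue could look roughly like the sketch below. It is only an
illustration under stated assumptions: the class StaleGroupScanner and its input map are
hypothetical, and the map of group id to last offset update time is supplied in memory
rather than read from ZooKeeper (where the modification time of a group's offset znodes,
available via Stat.getMtime(), would be one plausible source).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of the KAFKA-559 patch.
final class StaleGroupScanner {
    /**
     * Returns the ids of groups whose most recent offset update is strictly older
     * than sinceMillis, in sorted order. A group updated exactly at sinceMillis is
     * treated as still active.
     *
     * @param lastOffsetUpdateMillis group id -> time of its latest offset update (epoch ms)
     * @param sinceMillis            cutoff derived from the --since date (epoch ms)
     */
    static List<String> staleGroups(Map<String, Long> lastOffsetUpdateMillis, long sinceMillis) {
        List<String> stale = new ArrayList<>();
        // Copy into a TreeMap so the result order is deterministic.
        for (Map.Entry<String, Long> entry : new TreeMap<>(lastOffsetUpdateMillis).entrySet()) {
            if (entry.getValue() < sinceMillis) {
                stale.add(entry.getKey());
            }
        }
        return stale;
    }
}
```

Returning the list instead of deleting inside the loop keeps the selection step identical
for dry-run and real runs; only the caller decides whether to act on it.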

