kafka-jira mailing list archives

From "James Cheng (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-4893) async topic deletion conflicts with max topic length
Date Mon, 19 Mar 2018 22:46:00 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16405558#comment-16405558
] 

James Cheng commented on KAFKA-4893:
------------------------------------

We ran into another error which may be related to this:

{code:java}
[2018-03-17 03:11:39,655] ERROR There was an error in one of the threads during logs loading:
java.lang.IllegalArgumentException: Duplicate log directories found: /redacted/kafka/disk-xvdg/log/mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.foofoofoofoofoofoofoofoofoofoofo-3,
/redacted/kafka/disk-xvdh/log/mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.foofoofoofoofoofoofoofoofoofoofo-3!
(kafka.log.LogManager)
[2018-03-17 03:11:39,664] FATAL [Kafka Server 1], Fatal error during KafkaServer startup.
Prepare to shutdown (kafka.server.KafkaServer)
java.lang.IllegalArgumentException: Duplicate log directories found: /redacted/kafka/disk-xvdg/log/mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.foofoofoofoofoofoofoofoofoofoofo-3,
/redacted/kafka/disk-xvdh/log/mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.mirror.foofoofoofoofoofoofoofoofoofoofo-3!
{code}

We ran into this error upon broker startup, 2 weeks after we recovered from the above scenario.
Our theory (unconfirmed) is that:
1) The situation in this JIRA happened, leaving the topic in a partially deleted state.
2) The topic was recreated, and (somehow) made it past the partition-deleted state. The directory
for the partition was created on a different log.dir.
3) Time passes.
4) Upon startup, Kafka validates all log.dirs and finds that the same partition exists in
different log.dirs.
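
The check in step 4 can be pictured with a minimal sketch. This is not Kafka's LogManager code; the class and method names are hypothetical, and the directory listings are passed in as plain maps instead of being read from disk. It only shows the idea: a partition directory name that appears under more than one log.dir is reported as a duplicate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of Kafka.
class DuplicateLogDirCheck {

    // Input: log.dir path -> names of the partition directories found in it.
    // Output: partition directory name -> every log.dir that contains it,
    // restricted to names that occur in two or more log.dirs.
    static Map<String, List<String>> findDuplicates(Map<String, List<String>> partitionDirsByLogDir) {
        Map<String, List<String>> locations = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : partitionDirsByLogDir.entrySet()) {
            for (String partitionDir : entry.getValue()) {
                locations.computeIfAbsent(partitionDir, k -> new ArrayList<>()).add(entry.getKey());
            }
        }
        // Drop every partition directory that lives in exactly one log.dir.
        locations.values().removeIf(dirs -> dirs.size() < 2);
        return locations;
    }

    public static void main(String[] args) {
        Map<String, List<String>> listing = new HashMap<>();
        listing.put("/kafka/disk-xvdg/log", Arrays.asList("mirror.foo-3", "other-0"));
        listing.put("/kafka/disk-xvdh/log", Arrays.asList("mirror.foo-3"));
        // Only mirror.foo-3 is reported, because it exists under both log.dirs.
        System.out.println(findDuplicates(listing));
    }
}
```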

We didn't have time to verify this theory, but we thought we would leave it here in case someone
else runs into it and has time to look into it.

> async topic deletion conflicts with max topic length
> ----------------------------------------------------
>
>                 Key: KAFKA-4893
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4893
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Onur Karaman
>            Assignee: Vahid Hashemian
>            Priority: Minor
>             Fix For: 2.0.0
>
>
> As per the [documentation|http://kafka.apache.org/documentation/#basic_ops_add_topic],
topic names can be at most 249 characters long to line up with typical filesystem limitations:
> {quote}
> Each sharded partition log is placed into its own folder under the Kafka log directory.
The name of such folders consists of the topic name, appended by a dash (-) and the partition
id. Since a typical folder name can not be over 255 characters long, there will be a limitation
on the length of topic names. We assume the number of partitions will not ever be above 100,000.
Therefore, topic names cannot be longer than 249 characters. This leaves just enough room
in the folder name for a dash and a potentially 5 digit long partition id.
> {quote}
> {{kafka.common.Topic.maxNameLength}} is set to 249 and is used during validation.
> This limit ends up not being quite right since topic deletion ends up renaming the directory
to the form {{topic-partition.uniqueId-delete}} as can be seen in {{LogManager.asyncDelete}}:
> {code}
> val dirName = new StringBuilder(removedLog.name)
>                   .append(".")
>                   .append(java.util.UUID.randomUUID.toString.replaceAll("-",""))
>                   .append(Log.DeleteDirSuffix)
>                   .toString()
> {code}
> So the unique id and "-delete" suffix end up hogging some of the characters. Deleting
a long-named topic results in a log message such as the following:
> {code}
> kafka.common.KafkaStorageException: Failed to rename log directory from /tmp/kafka-logs0/000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000-0
to /tmp/kafka-logs0/000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000-0.797bba3fb2464729840f87769243edbb-delete
>   at kafka.log.LogManager.asyncDelete(LogManager.scala:439)
>   at kafka.cluster.Partition$$anonfun$delete$1.apply$mcV$sp(Partition.scala:142)
>   at kafka.cluster.Partition$$anonfun$delete$1.apply(Partition.scala:137)
>   at kafka.cluster.Partition$$anonfun$delete$1.apply(Partition.scala:137)
>   at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:213)
>   at kafka.utils.CoreUtils$.inWriteLock(CoreUtils.scala:221)
>   at kafka.cluster.Partition.delete(Partition.scala:137)
>   at kafka.server.ReplicaManager.stopReplica(ReplicaManager.scala:230)
>   at kafka.server.ReplicaManager$$anonfun$stopReplicas$2.apply(ReplicaManager.scala:260)
>   at kafka.server.ReplicaManager$$anonfun$stopReplicas$2.apply(ReplicaManager.scala:259)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at kafka.server.ReplicaManager.stopReplicas(ReplicaManager.scala:259)
>   at kafka.server.KafkaApis.handleStopReplicaRequest(KafkaApis.scala:174)
>   at kafka.server.KafkaApis.handle(KafkaApis.scala:86)
>   at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:64)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> The topic after this point still exists but has Leader set to -1 and the controller recognizes
the topic deletion as incomplete (the topic znode is still in /admin/delete_topics).
> I don't believe LinkedIn has any topic name this long but I'm making the ticket in case
anyone runs into this problem.
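
The length arithmetic in the issue description can be checked with a short sketch. It assumes the rename pattern quoted above (topic, dash, partition id, dot, a UUID with its dashes removed, then the "-delete" suffix) and the 255-character folder-name limit cited from the documentation. The class, method, and constant names are hypothetical and are not Kafka's.

```java
import java.util.UUID;

// Hypothetical illustration only; not part of Kafka.
class DeleteDirNameLength {
    static final int TYPICAL_MAX_FOLDER_NAME_LENGTH = 255; // limit cited in the documentation
    static final String DELETE_SUFFIX = "-delete";          // suffix described in the issue

    // Builds a name of the form topic-partition.uniqueId-delete.
    static String deleteDirName(String topic, int partition) {
        String uniqueId = UUID.randomUUID().toString().replaceAll("-", ""); // 32 hex characters
        return topic + "-" + partition + "." + uniqueId + DELETE_SUFFIX;
    }

    // Longest topic name whose renamed directory still fits within the limit,
    // given the maximum number of digits in a partition id.
    static int longestSafeTopicLength(int maxPartitionDigits) {
        int overhead = 1 + maxPartitionDigits + 1 + 32 + DELETE_SUFFIX.length();
        return TYPICAL_MAX_FOLDER_NAME_LENGTH - overhead;
    }

    // Returns a string of n '0' characters, like the topic name in the log above.
    static String zeros(int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append('0');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String renamed = deleteDirName(zeros(249), 0);
        // 249 + 1 + 1 + 1 + 32 + 7 = 291, which exceeds 255.
        System.out.println("renamed length: " + renamed.length());
        // 255 - (1 + 5 + 1 + 32 + 7) = 209 for 5-digit partition ids.
        System.out.println("longest safe topic length: " + longestSafeTopicLength(5));
    }
}
```

Under these assumptions, a 249-character topic name is valid at creation time, yet its renamed directory is 36 characters over the limit even for a single-digit partition id.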



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
