kafka-dev mailing list archives

From "Onur Karaman (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-5099) Replica Deletion Regression from KIP-101
Date Sat, 06 May 2017 03:39:04 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999250#comment-15999250 ]

Onur Karaman commented on KAFKA-5099:
-------------------------------------

I've now run the file descriptor experiment several times, against both my patch and git hash b611cfa5c0f4941a491781424afd9b699bdb894e
(the commit before KIP-101: 0baea2ac13532981f3fea11e5dfc6da5aafaeaa8).

With my patch, I've seen file descriptor reclamation take as little as 4 minutes, and in some runs
it never completed (I gave up waiting after 20 minutes). With b611cfa5c0f4941a491781424afd9b699bdb894e,
reclamation never completed in any run (again, I gave up waiting after 20 minutes).
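
For reference, here's roughly how such an experiment can be scripted (a hypothetical helper, assuming a Unix host with lsof on the PATH; this is not part of my patch): poll the broker process and watch whether its open descriptor count drops back down after deletion.

{code}
import scala.sys.process._

// Hypothetical file descriptor watcher, assuming a Unix host with lsof.
// Polls the broker process and prints its open descriptor count so you
// can see whether descriptors are ever reclaimed after replica deletion.
object FdWatch {
  def main(args: Array[String]): Unit = {
    val brokerPid = args(0)
    while (true) {
      // lsof prints a header line plus one line per open descriptor.
      val count = Seq("lsof", "-p", brokerPid).!!.trim.split("\n").length - 1
      println(s"${System.currentTimeMillis()} open fds: $count")
      Thread.sleep(10000L)
    }
  }
}
{code}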

I'm not sure what to make of this. b611cfa5c0f4941a491781424afd9b699bdb894e is more recent than
the fix for KAFKA-4614 (5fc530bc483db145e0cba3b63a57d6d6a7c547f2).

I'm just going to submit the PR.

> Replica Deletion Regression from KIP-101
> ----------------------------------------
>
>                 Key: KAFKA-5099
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5099
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Onur Karaman
>            Assignee: Onur Karaman
>            Priority: Blocker
>             Fix For: 0.11.0.0
>
>
> It appears that replica deletion regressed from KIP-101. Replica deletion happens when a
> broker receives a StopReplicaRequest with delete=true. Ever since KAFKA-1911, replica deletion
> has been async: the broker responds with a StopReplicaResponse as soon as it has marked the
> replica directory as staged for deletion. The marking works by moving a data log directory and
> its contents, such as /tmp/kafka-logs1/t1-0, to a marked directory like
> /tmp/kafka-logs1/t1-0.8c9c4c0c61c44cc59ebeb00075a2a07f-delete, acting as a soft-delete. A
> scheduled thread later actually deletes the data. The regression shows up when that scheduled
> thread tries to perform the actual deletion; by then the broker has already responded, so the
> controller considers operations such as partition reassignment and topic deletion complete. But
> if you look at the log4j logs and data logs, you'll find that the soft-deleted data logs never
> actually get deleted. Restarting the broker does allow the soft-deleted directories to get
> deleted (see the sketch below).
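> A minimal sketch of that soft-delete scheme, with hypothetical names (this is not Kafka's
> actual LogManager code): rename the directory to a UUID-suffixed *-delete name, respond
> immediately, and let a scheduled task do the physical deletion later.
> {code}
> import java.io.File
> import java.nio.file.{Files, StandardCopyOption}
> import java.util.UUID
> import java.util.concurrent.{Executors, TimeUnit}
>
> // Minimal sketch of async (soft) replica deletion; hypothetical names.
> object SoftDeleteSketch {
>   private val scheduler = Executors.newSingleThreadScheduledExecutor()
>
>   // Stage the replica directory for deletion and return immediately,
>   // so a StopReplicaResponse can be sent without waiting on disk I/O.
>   def asyncDelete(logDir: File): File = {
>     val uuid = UUID.randomUUID().toString.replace("-", "")
>     val marked = new File(logDir.getParent, s"${logDir.getName}.$uuid-delete")
>     // The soft-delete is just a rename of the whole directory.
>     Files.move(logDir.toPath, marked.toPath, StandardCopyOption.ATOMIC_MOVE)
>     // A scheduled task performs the actual deletion later.
>     scheduler.schedule(new Runnable {
>       def run(): Unit = deleteRecursively(marked)
>     }, 60L, TimeUnit.SECONDS)
>     marked
>   }
>
>   private def deleteRecursively(f: File): Unit = {
>     Option(f.listFiles).foreach(_.foreach(deleteRecursively))
>     f.delete()
>   }
> }
> {code}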
> Here's the setup:
> {code}
> > ./bin/zookeeper-server-start.sh config/zookeeper.properties
> > export LOG_DIR=logs0 && ./bin/kafka-server-start.sh config/server0.properties
> > export LOG_DIR=logs1 && ./bin/kafka-server-start.sh config/server1.properties
> > ./bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic t0 --replica-assignment 1:0
> > ./bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic t1 --replica-assignment 1:0
> > ./bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic t0
> > cat p.txt
> {"partitions":
>  [
>   {"topic": "t1", "partition": 0, "replicas": [0] }
>  ],
> "version":1
> }
> > ./bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file p.txt --execute
> {code}
> Here are sample logs:
> {code}
> [2017-04-20 17:46:54,801] INFO [ReplicaFetcherManager on broker 1] Removed fetcher for partitions t0-0 (kafka.server.ReplicaFetcherManager)
> [2017-04-20 17:46:54,814] INFO Log for partition t0-0 is renamed to /tmp/kafka-logs1/t0-0.bbc8fa126e3e4ff787f6b68d158ab771-delete and is scheduled for deletion (kafka.log.LogManager)
> [2017-04-20 17:47:27,585] INFO Deleting index /tmp/kafka-logs1/t0-0.bbc8fa126e3e4ff787f6b68d158ab771-delete/00000000000000000000.index (kafka.log.OffsetIndex)
> [2017-04-20 17:47:27,586] INFO Deleting index /tmp/kafka-logs1/t0-0/00000000000000000000.timeindex (kafka.log.TimeIndex)
> [2017-04-20 17:47:27,587] ERROR Exception in deleting Log(/tmp/kafka-logs1/t0-0.bbc8fa126e3e4ff787f6b68d158ab771-delete). Moving it to the end of the queue. (kafka.log.LogManager)
> java.io.FileNotFoundException: /tmp/kafka-logs1/t0-0/leader-epoch-checkpoint.tmp (No such file or directory)
>   at java.io.FileOutputStream.open0(Native Method)
>   at java.io.FileOutputStream.open(FileOutputStream.java:270)
>   at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
>   at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
>   at kafka.server.checkpoints.CheckpointFile.write(CheckpointFile.scala:41)
>   at kafka.server.checkpoints.LeaderEpochCheckpointFile.write(LeaderEpochCheckpointFile.scala:61)
>   at kafka.server.epoch.LeaderEpochFileCache.kafka$server$epoch$LeaderEpochFileCache$$flush(LeaderEpochFileCache.scala:178)
>   at kafka.server.epoch.LeaderEpochFileCache$$anonfun$clear$1.apply$mcV$sp(LeaderEpochFileCache.scala:161)
>   at kafka.server.epoch.LeaderEpochFileCache$$anonfun$clear$1.apply(LeaderEpochFileCache.scala:159)
>   at kafka.server.epoch.LeaderEpochFileCache$$anonfun$clear$1.apply(LeaderEpochFileCache.scala:159)
>   at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:213)
>   at kafka.utils.CoreUtils$.inWriteLock(CoreUtils.scala:221)
>   at kafka.server.epoch.LeaderEpochFileCache.clear(LeaderEpochFileCache.scala:159)
>   at kafka.log.Log.delete(Log.scala:1051)
>   at kafka.log.LogManager.kafka$log$LogManager$$deleteLogs(LogManager.scala:442)
>   at kafka.log.LogManager$$anonfun$startup$5.apply$mcV$sp(LogManager.scala:241)
>   at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:110)
>   at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:57)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
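> Reading the trace: Log.delete clears the leader epoch cache, and the cache flushes its
> checkpoint to a path that still names the original directory, which the soft-delete rename has
> already moved away. A distilled sketch of that failure mode (hypothetical names, assuming the
> checkpoint writer captured the pre-rename path):
> {code}
> import java.io.{File, FileOutputStream}
>
> // Distilled failure mode: the checkpoint writer holds the pre-rename
> // path, the directory gets renamed, and the write then fails.
> object StaleCheckpointSketch {
>   def main(args: Array[String]): Unit = {
>     val dir = new File("/tmp/t0-0")
>     dir.mkdirs()
>     // The writer captures its target path before the soft-delete rename.
>     val checkpointTmp = new File(dir, "leader-epoch-checkpoint.tmp")
>     // Soft-delete: the whole directory is renamed out from under it.
>     dir.renameTo(new File("/tmp/t0-0.deadbeef-delete"))
>     // Flushing the checkpoint now throws java.io.FileNotFoundException
>     // (No such file or directory), as in the log above.
>     new FileOutputStream(checkpointTmp).close()
>   }
> }
> {code}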
> The same appears on broker 1 for t1-0 after partition reassignment moves the replica off broker 1.
> Here's what the data logs show:
> {code}
> > l /tmp/kafka-logs*/t*
> /tmp/kafka-logs0/t0-0.166137ca8c28475d8c9fbf6f423a937e-delete:
> total 0
> -rw-r--r--  1 okaraman  wheel   0 Apr 20 17:46 00000000000000000000.timeindex
> -rw-r--r--  1 okaraman  wheel   0 Apr 20 17:46 leader-epoch-checkpoint
> drwxr-xr-x  2 okaraman  wheel  68 Apr 20 17:46 pid-mapping
> /tmp/kafka-logs0/t1-0:
> total 40960
> -rw-r--r--  1 okaraman  wheel  10485760 Apr 20 17:46 00000000000000000000.index
> -rw-r--r--  1 okaraman  wheel         0 Apr 20 17:46 00000000000000000000.log
> -rw-r--r--  1 okaraman  wheel  10485756 Apr 20 17:46 00000000000000000000.timeindex
> -rw-r--r--  1 okaraman  wheel         0 Apr 20 17:46 leader-epoch-checkpoint
> drwxr-xr-x  2 okaraman  wheel        68 Apr 20 17:46 pid-mapping
> /tmp/kafka-logs1/t0-0.bbc8fa126e3e4ff787f6b68d158ab771-delete:
> total 0
> -rw-r--r--  1 okaraman  wheel   0 Apr 20 17:46 00000000000000000000.timeindex
> -rw-r--r--  1 okaraman  wheel   0 Apr 20 17:46 leader-epoch-checkpoint
> drwxr-xr-x  2 okaraman  wheel  68 Apr 20 17:46 pid-mapping
> /tmp/kafka-logs1/t1-0.8c9c4c0c61c44cc59ebeb00075a2a07f-delete:
> total 0
> -rw-r--r--  1 okaraman  wheel   0 Apr 20 17:48 00000000000000000000.timeindex
> -rw-r--r--  1 okaraman  wheel   0 Apr 20 17:46 leader-epoch-checkpoint
> drwxr-xr-x  2 okaraman  wheel  68 Apr 20 17:46 pid-mapping
> {code}
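> The leftover *-delete directories above are what a broker restart cleans up, presumably
> because startup scans each data log directory and deletes anything still carrying the -delete
> suffix. A hypothetical sketch of that recovery pass:
> {code}
> import java.io.File
>
> // Hypothetical sketch of the startup pass that lets a restart reclaim
> // soft-deleted directories like the ones listed above.
> object StartupCleanupSketch {
>   def cleanLogDir(logDir: File): Unit =
>     Option(logDir.listFiles).getOrElse(Array.empty[File])
>       .filter(d => d.isDirectory && d.getName.endsWith("-delete"))
>       .foreach(deleteRecursively)
>
>   private def deleteRecursively(f: File): Unit = {
>     Option(f.listFiles).foreach(_.foreach(deleteRecursively))
>     f.delete()
>   }
> }
> {code}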
> I ran the above test on the commit preceding KIP-101 and it worked just fine.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
