Date: Wed, 25 Oct 2017 16:38:00 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: jira@kafka.apache.org
Reply-To: jira@kafka.apache.org
Subject: [jira] [Commented] (KAFKA-6075) Kafka cannot recover after an unclean shutdown on Windows

    [ https://issues.apache.org/jira/browse/KAFKA-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16219019#comment-16219019 ]

ASF GitHub Bot commented on KAFKA-6075:
---------------------------------------

GitHub user tedyu opened a pull request:

    https://github.com/apache/kafka/pull/4134

    KAFKA-6075 Kafka cannot recover after an unclean shutdown on Windows

    As Vahid commented, Files.deleteIfExists(file.toPath) seems to destabilize the Windows environment. This PR reverts to calling delete() directly.
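The contract difference the revert relies on can be sketched as follows (a minimal standalone illustration, not Kafka code; on Windows, the throwing path is triggered when the .timeindex file is still held open by a memory mapping):

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeleteIdiomDemo {
    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("segment", ".timeindex");

        // NIO idiom used before the revert: Files.deleteIfExists throws an
        // IOException (a FileSystemException on Windows when another handle
        // holds the file), which aborts log-segment loading as in the report.
        System.out.println("deleteIfExists: " + Files.deleteIfExists(p)); // true: file existed

        // Pre-NIO idiom the PR reverts to: File.delete reports failure via a
        // boolean and never throws, so a Windows sharing violation does not
        // propagate as an exception up through Log.loadSegmentFiles.
        File f = p.toFile();
        System.out.println("delete: " + f.delete()); // false: already gone
    }
}
```

The trade-off is that `delete()` can fail silently, so callers that need to know why a delete failed must check the boolean themselves.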
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tedyu/kafka trunk

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/4134.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #4134

----
commit c734471f496b2bace8359f9c899cca73e636aa8d
Author: tedyu
Date:   2017-10-25T16:28:18Z

    KAFKA-6075 Kafka cannot recover after an unclean shutdown on Windows

commit 7355e8c282b1cb7d70f2c290702b5b216f28d3cd
Author: tedyu
Date:   2017-10-25T16:36:50Z

    Use delete()

----

> Kafka cannot recover after an unclean shutdown on Windows
> ---------------------------------------------------------
>
>                 Key: KAFKA-6075
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6075
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.11.0.1
>            Reporter: Vahid Hashemian
>
> Kafka cannot recover from an unclean shutdown of a broker on Windows. Steps to reproduce from a fresh build:
> # Start ZooKeeper
> # Start a broker
> # Create a topic {{test}}
> # Do an unclean shutdown of the broker (find the process id with {{wmic process where "caption = 'java.exe' and commandline like '%server.properties%'" get processid}}, then kill the process with {{taskkill /pid #### /f}})
> # Start the broker again
> This leads to the following errors:
> {code}
> [2017-10-17 17:13:24,819] ERROR Error while loading log dir C:\tmp\kafka-logs (kafka.log.LogManager)
> java.nio.file.FileSystemException: C:\tmp\kafka-logs\test-0\00000000000000000000.timeindex: The process cannot access the file because it is being used by another process.
>     at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86)
>     at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
>     at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102)
>     at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269)
>     at sun.nio.fs.AbstractFileSystemProvider.deleteIfExists(AbstractFileSystemProvider.java:108)
>     at java.nio.file.Files.deleteIfExists(Files.java:1165)
>     at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:333)
>     at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:295)
>     at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
>     at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>     at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
>     at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
>     at kafka.log.Log.loadSegmentFiles(Log.scala:295)
>     at kafka.log.Log.loadSegments(Log.scala:404)
>     at kafka.log.Log.<init>(Log.scala:201)
>     at kafka.log.Log$.apply(Log.scala:1729)
>     at kafka.log.LogManager.kafka$log$LogManager$$loadLog(LogManager.scala:221)
>     at kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$8$$anonfun$apply$16$$anonfun$apply$2.apply$mcV$sp(LogManager.scala:292)
>     at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> [2017-10-17 17:13:24,819] ERROR Error while deleting the clean shutdown file in dir C:\tmp\kafka-logs (kafka.server.LogDirFailureChannel)
> java.nio.file.FileSystemException: C:\tmp\kafka-logs\test-0\00000000000000000000.timeindex: The process cannot access the file because it is being used by another process.
>     at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86)
>     at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
>     at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102)
>     at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269)
>     at sun.nio.fs.AbstractFileSystemProvider.deleteIfExists(AbstractFileSystemProvider.java:108)
>     at java.nio.file.Files.deleteIfExists(Files.java:1165)
>     at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:333)
>     at kafka.log.Log$$anonfun$loadSegmentFiles$3.apply(Log.scala:295)
>     at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
>     at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>     at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
>     at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
>     at kafka.log.Log.loadSegmentFiles(Log.scala:295)
>     at kafka.log.Log.loadSegments(Log.scala:404)
>     at kafka.log.Log.<init>(Log.scala:201)
>     at kafka.log.Log$.apply(Log.scala:1729)
>     at kafka.log.LogManager.kafka$log$LogManager$$loadLog(LogManager.scala:221)
>     at kafka.log.LogManager$$anonfun$loadLogs$2$$anonfun$8$$anonfun$apply$16$$anonfun$apply$2.apply$mcV$sp(LogManager.scala:292)
>     at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:61)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> [2017-10-17 17:13:24,819] INFO Logs loading complete in 47 ms. (kafka.log.LogManager)
> [2017-10-17 17:13:24,865] WARN Error processing kafka.log:type=LogManager,name=LogDirectoryOffline,logDirectory=C:\tmp\kafka-logs (com.yammer.metrics.reporting.JmxReporter)
> javax.management.MalformedObjectNameException: Invalid character ':' in value part of property
>     at javax.management.ObjectName.construct(ObjectName.java:618)
>     at javax.management.ObjectName.<init>(ObjectName.java:1382)
>     at com.yammer.metrics.reporting.JmxReporter.onMetricAdded(JmxReporter.java:395)
>     at com.yammer.metrics.core.MetricsRegistry.notifyMetricAdded(MetricsRegistry.java:516)
>     at com.yammer.metrics.core.MetricsRegistry.getOrAdd(MetricsRegistry.java:491)
>     at com.yammer.metrics.core.MetricsRegistry.newGauge(MetricsRegistry.java:79)
>     at kafka.metrics.KafkaMetricsGroup$class.newGauge(KafkaMetricsGroup.scala:80)
>     at kafka.log.LogManager.newGauge(LogManager.scala:50)
>     at kafka.log.LogManager$$anonfun$6.apply(LogManager.scala:117)
>     at kafka.log.LogManager$$anonfun$6.apply(LogManager.scala:116)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>     at kafka.log.LogManager.<init>(LogManager.scala:116)
>     at kafka.log.LogManager$.apply(LogManager.scala:799)
>     at kafka.server.KafkaServer.startup(KafkaServer.scala:222)
>     at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
>     at kafka.Kafka$.main(Kafka.scala:92)
>     at kafka.Kafka.main(Kafka.scala)
> [2017-10-17 17:13:24,865] INFO Starting log cleanup with a period of 300000 ms. (kafka.log.LogManager)
> [2017-10-17 17:13:24,881] INFO Starting log flusher with a default period of 9223372036854775807 ms. (kafka.log.LogManager)
> [2017-10-17 17:13:25,131] INFO Awaiting socket connections on 0.0.0.0:9092. (kafka.network.Acceptor)
> [2017-10-17 17:13:25,147] INFO [SocketServer brokerId=0] Started 1 acceptor threads (kafka.network.SocketServer)
> [2017-10-17 17:13:25,162] INFO [ExpirationReaper-0-Produce]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
> [2017-10-17 17:13:25,162] INFO [ExpirationReaper-0-DeleteRecords]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
> [2017-10-17 17:13:25,162] INFO [ExpirationReaper-0-Fetch]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
> [2017-10-17 17:13:25,162] INFO [LogDirFailureHandler]: Starting (kafka.server.ReplicaManager$LogDirFailureHandler)
> [2017-10-17 17:13:25,162] INFO [ReplicaManager broker=0] Stopping serving replicas in dir C:\tmp\kafka-logs (kafka.server.ReplicaManager)
> [2017-10-17 17:13:25,162] INFO [ReplicaManager broker=0] Partitions are offline due to failure on log directory C:\tmp\kafka-logs (kafka.server.ReplicaManager)
> [2017-10-17 17:13:25,162] INFO [ReplicaFetcherManager on broker 0] Removed fetcher for partitions (kafka.server.ReplicaFetcherManager)
> [2017-10-17 17:13:25,178] INFO [ReplicaManager broker=0] Broker 0 stopped fetcher for partitions because they are in the failed log dir C:\tmp\kafka-logs (kafka.server.ReplicaManager)
> [2017-10-17 17:13:25,178] INFO Stopping serving logs in dir C:\tmp\kafka-logs (kafka.log.LogManager)
> [2017-10-17 17:13:25,178] FATAL Shutdown broker because all log dirs in C:\tmp\kafka-logs have failed (kafka.log.LogManager)
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
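As an aside, the MalformedObjectNameException in the WARN line above is a separate symptom: the Windows log-directory path contains ':', which is illegal in the unquoted value part of a JMX ObjectName property. A minimal reproduction using plain JMX (not Kafka code; `ObjectName.quote` is the standard escaping mechanism):

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class ObjectNameColonDemo {
    public static void main(String[] args) throws Exception {
        String dir = "C:\\tmp\\kafka-logs";

        // Unquoted, the ':' after the drive letter is an invalid character in
        // the property value, so construction is rejected, exactly as in the
        // JmxReporter warning in the broker log.
        try {
            new ObjectName("kafka.log:type=LogManager,logDirectory=" + dir);
            System.out.println("unexpectedly accepted");
        } catch (MalformedObjectNameException e) {
            System.out.println("rejected: " + e.getMessage());
        }

        // Quoting the value escapes the special characters and makes the
        // ObjectName legal.
        ObjectName ok = new ObjectName(
                "kafka.log:type=LogManager,logDirectory=" + ObjectName.quote(dir));
        System.out.println("accepted: " + ok);
    }
}
```

This is only a metrics-registration failure; the FATAL shutdown at the end of the log comes from the locked .timeindex file, not from JMX.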