From: GitBox
To: reviews@spark.apache.org
Subject: [GitHub] [spark] turboFei commented on a change in pull request #25795: [SPARK-29037][Core] Spark gives duplicate result when an application was killed
Date: Tue, 17 Sep 2019 07:31:44 -0000

turboFei commented on a change in pull request #25795: [SPARK-29037][Core] Spark gives duplicate result when an application was killed
URL: https://github.com/apache/spark/pull/25795#discussion_r325018791

##########
File path: core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala
##########
@@ -160,11 +160,15 @@ class HadoopMapReduceCommitProtocol(
     val taskAttemptContext = new TaskAttemptContextImpl(jobContext.getConfiguration, taskAttemptId)
     committer = setupCommitter(taskAttemptContext)
-    committer.setupJob(jobContext)
+    if (!dynamicPartitionOverwrite) {

Review comment:
When a job is killed, its staging dir can be cleaned up by the `abortJob` method.
But when an application is killed, its job's staging dir is not cleaned up gracefully.
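
The distinction drawn in the review comment above can be illustrated with a small, self-contained sketch. It is not Spark's or Hadoop's code: `SketchCommitter`, `StagingDirSketch`, and the `.staging` directory layout are hypothetical names chosen for illustration, and "application killed" is modelled simply as the abort path never getting a chance to run.

```scala
import java.io.File
import java.nio.file.Files

// Hypothetical stand-in for a committer; NOT the real Hadoop OutputCommitter.
final class SketchCommitter(val stagingDir: File) {

  // Job setup creates the staging directory.
  def setupJob(): Unit = {
    stagingDir.mkdirs()
    ()
  }

  // Job abort removes the staging directory and everything under it.
  def abortJob(): Unit = deleteRecursively(stagingDir)

  private def deleteRecursively(f: File): Unit = {
    val children = f.listFiles() // null when f is not a directory
    if (children != null) children.foreach(deleteRecursively)
    f.delete()
    ()
  }
}

object StagingDirSketch {
  def main(args: Array[String]): Unit = {
    val root = Files.createTempDirectory("staging-sketch").toFile

    // Case 1: the job is killed while the application is still alive,
    // so the abort path runs and removes the staging directory.
    val killedJob = new SketchCommitter(new File(root, "job-1/.staging"))
    killedJob.setupJob()
    killedJob.abortJob()
    println(s"job killed, staging dir exists: ${killedJob.stagingDir.exists()}")

    // Case 2 (assumption): the whole application is killed, modelled here
    // by abortJob never being called, so the staging directory is left behind.
    val killedApp = new SketchCommitter(new File(root, "job-2/.staging"))
    killedApp.setupJob()
    println(s"application killed, staging dir exists: ${killedApp.stagingDir.exists()}")
  }
}
```

Running `StagingDirSketch` should report that the staging dir no longer exists in the first case and still exists in the second, which is the leftover state the comment is concerned about.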