From: awarrior
To: reviews@spark.apache.org
Subject: [GitHub] spark issue #19118: [SPARK-21882][CORE] OutputMetrics doesn't count written ...
Date: Tue, 12 Sep 2017 06:51:43 +0000 (UTC)

Github user awarrior commented on the issue:

    https://github.com/apache/spark/pull/19118

    @jiangxb1987 well, I got past the part above, but hit other places where
    FileSystem initialization can happen before runJob. They are in the write
    function of SparkHadoopWriter:

    > // Assert the output format/key/value class is set in JobConf.
    > config.assertConf(jobContext, rdd.conf)            //// <= chance
    >
    > val committer = config.createCommitter(stageId)
    > committer.setupJob(jobContext)                     //// <= chance
    >
    > // Try to write all RDD partitions as a Hadoop OutputFormat.
    > try {
    >   val ret = sparkContext.runJob(rdd, (context: TaskContext, iter: Iterator[(K, V)]) => {
    >     executeTask(
    >       context = context,
    >       config = config,
    >       jobTrackerId = jobTrackerId,
    >       sparkStageId = context.stageId,
    >       sparkPartitionId = context.partitionId,
    >       sparkAttemptNumber = context.attemptNumber,
    >       committer = committer,
    >       iterator = iter)
    >   })

    One example stack trace:

    > java.lang.Thread.State: RUNNABLE
    >     at org.apache.hadoop.fs.FileSystem.getStatistics(FileSystem.java:3270)
    >     - locked <0x126a> (a java.lang.Class)
    >     at org.apache.hadoop.fs.FileSystem.initialize(FileSystem.java:202)
    >     at org.apache.hadoop.fs.RawLocalFileSystem.initialize(RawLocalFileSystem.java:92)
    >     at org.apache.hadoop.fs.LocalFileSystem.initialize(LocalFileSystem.java:47)
    >     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2598)
    >     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
    >     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2632)
    >     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2614)
    >     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
    >     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
    >     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:354)
    >     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
    >     at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:91)
    >     at org.apache.hadoop.mapred.FileOutputCommitter.getWrapped(FileOutputCommitter.java:65)
    >     at org.apache.hadoop.mapred.FileOutputCommitter.setupJob(FileOutputCommitter.java:131)
    >     at org.apache.hadoop.mapred.OutputCommitter.setupJob(OutputCommitter.java:233)
    >     at org.apache.spark.internal.io.HadoopMapReduceCommitProtocol.setupJob(HadoopMapReduceCommitProtocol.scala:125)
    >     at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:74)
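    For reference, a minimal standalone sketch (an illustration, not code from
    the PR) of why committer.setupJob initializes a FileSystem on the driver:
    FileOutputCommitter's constructor resolves the output path via
    Path.getFileSystem, which goes through the FileSystem cache and the
    class-level lock in FileSystem.getStatistics visible in the trace above.
    The output location below is a made-up example.

        import org.apache.hadoop.conf.Configuration
        import org.apache.hadoop.fs.Path

        object FsInitSketch {
          def main(args: Array[String]): Unit = {
            val conf = new Configuration()
            // Hypothetical output dir, only used to trigger path resolution.
            val out = new Path("file:///tmp/spark-out")
            // A cache miss in FileSystem.get creates and initializes a new
            // FileSystem instance; initialization takes the class-level lock
            // in FileSystem.getStatistics, the lock held in the trace above.
            val fs = out.getFileSystem(conf)
            println(s"FileSystem initialized: ${fs.getClass.getName}")
          }
        }

    Running this on the driver before any job exercises the same
    FileSystem.get -> initialize -> getStatistics path that setupJob hits.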