From: pwendell@apache.org
To: commits@spark.apache.org
Subject: git commit: SPARK-1518: FileLogger: Fix compile against Hadoop trunk
Date: Wed, 4 Jun 2014 22:56:46 +0000 (UTC)

Repository: spark
Updated Branches:
  refs/heads/branch-1.0 d96794132 -> 3df55cb69


SPARK-1518: FileLogger: Fix compile against Hadoop trunk

In Hadoop trunk (currently Hadoop 3.0.0), the deprecated
FSDataOutputStream#sync() method has been removed. Instead, we should call
FSDataOutputStream#hflush, which does the same thing as the deprecated method
used to do.
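For context, a minimal sketch of the replacement call in isolation (the object
name and path below are illustrative, not from the patch; assumes a Hadoop 2.x+
client on the classpath, where hflush() exists):

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FSDataOutputStream, FileSystem, Path}

    object HflushSketch {
      def main(args: Array[String]): Unit = {
        val fs = FileSystem.get(new Configuration())
        // Illustrative path; any HDFS-backed path would do.
        val out: FSDataOutputStream = fs.create(new Path("/tmp/spark-1518-sketch"))
        out.write("some event log line\n".getBytes("UTF-8"))
        // hflush() pushes buffered bytes out to the DataNode pipeline so that
        // new readers can see them; it replaces the removed sync().
        out.hflush()
        out.close()
      }
    }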
Author: Colin McCabe

Closes #898 from cmccabe/SPARK-1518 and squashes the following commits:

752b9d7 [Colin McCabe] FileLogger: Fix compile against Hadoop trunk

(cherry picked from commit 1765c8d0ddf6bb5bc3c21f994456eba04c581de4)
Signed-off-by: Patrick Wendell


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/3df55cb6
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/3df55cb6
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/3df55cb6

Branch: refs/heads/branch-1.0
Commit: 3df55cb69bffe6a15a5c240d5efec7d0e63517d8
Parents: d967941
Author: Colin McCabe
Authored: Wed Jun 4 15:56:29 2014 -0700
Committer: Patrick Wendell
Committed: Wed Jun 4 15:56:42 2014 -0700

----------------------------------------------------------------------
 .../scala/org/apache/spark/util/FileLogger.scala | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/3df55cb6/core/src/main/scala/org/apache/spark/util/FileLogger.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/util/FileLogger.scala b/core/src/main/scala/org/apache/spark/util/FileLogger.scala
index 0e6d21b..6a95dc0 100644
--- a/core/src/main/scala/org/apache/spark/util/FileLogger.scala
+++ b/core/src/main/scala/org/apache/spark/util/FileLogger.scala
@@ -61,6 +61,14 @@ private[spark] class FileLogger(
   // Only defined if the file system scheme is not local
   private var hadoopDataStream: Option[FSDataOutputStream] = None
 
+  // The Hadoop APIs have changed over time, so we use reflection to figure out
+  // the correct method to use to flush a hadoop data stream. See SPARK-1518
+  // for details.
+  private val hadoopFlushMethod = {
+    val cls = classOf[FSDataOutputStream]
+    scala.util.Try(cls.getMethod("hflush")).getOrElse(cls.getMethod("sync"))
+  }
+
   private var writer: Option[PrintWriter] = None
 
   /**
@@ -149,13 +157,13 @@ private[spark] class FileLogger(
   /**
    * Flush the writer to disk manually.
    *
-   * If the Hadoop FileSystem is used, the underlying FSDataOutputStream (r1.0.4) must be
-   * sync()'ed manually as it does not support flush(), which is invoked by when higher
-   * level streams are flushed.
+   * When using a Hadoop filesystem, we need to invoke the hflush or sync
+   * method. In HDFS, hflush guarantees that the data gets to all the
+   * DataNodes.
    */
   def flush() {
     writer.foreach(_.flush())
-    hadoopDataStream.foreach(_.sync())
+    hadoopDataStream.foreach(hadoopFlushMethod.invoke(_))
   }
 
   /**
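The reflection trick in the patch generalizes beyond FileLogger. Because the
method is looked up by name at runtime, neither hflush() nor sync() has to
resolve at compile time, so one binary builds and runs against Hadoop 1.x
(where only sync() exists) as well as Hadoop trunk (where only hflush()
exists). A self-contained sketch of the same idea (the HadoopFlush object name
is illustrative, not from the patch):

    import java.lang.reflect.Method

    import scala.util.Try

    import org.apache.hadoop.fs.FSDataOutputStream

    object HadoopFlush {
      // Resolve the flush method once: prefer hflush() (Hadoop 2.x+), fall
      // back to the deprecated sync() (Hadoop 1.x). The lookup happens at
      // runtime, so neither name has to exist in the compile-time Hadoop jar.
      private val flushMethod: Method = {
        val cls = classOf[FSDataOutputStream]
        Try(cls.getMethod("hflush")).getOrElse(cls.getMethod("sync"))
      }

      def flush(stream: FSDataOutputStream): Unit = {
        flushMethod.invoke(stream)
      }
    }

Call sites then replace stream.sync() with HadoopFlush.flush(stream), which is
essentially what the patched FileLogger.flush() does via
hadoopFlushMethod.invoke(_).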