spark-commits mailing list archives

From gurwls...@apache.org
Subject spark git commit: [SPARK-21764][TESTS] Fix tests failures on Windows: resources not being closed and incorrect paths
Date Wed, 30 Aug 2017 12:36:33 GMT
Repository: spark
Updated Branches:
  refs/heads/master 734ed7a7b -> b30a11a6a


[SPARK-21764][TESTS] Fix tests failures on Windows: resources not being closed and incorrect paths

## What changes were proposed in this pull request?

`org.apache.spark.deploy.RPackageUtilsSuite`

```
 - jars without manifest return false *** FAILED *** (109 milliseconds)
   java.io.IOException: Unable to delete file: C:\projects\spark\target\tmp\1500266936418-0\dep1-c.jar
```
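On Windows, a file with an open handle cannot be deleted, so leaving the `JarFile` unclosed makes the temp-directory cleanup above fail. The fix wraps the resource in `Utils.tryWithResource`. Below is a minimal, hedged sketch of that pattern; the standalone `tryWithResource` helper and the `hasManifest` function are illustrative stand-ins, not Spark's actual code:

```
import java.util.jar.JarFile

// Minimal stand-in for Spark's Utils.tryWithResource, shown only to illustrate
// the pattern: the resource is always closed, even when f throws.
def tryWithResource[R <: AutoCloseable, T](createResource: => R)(f: R => T): T = {
  val resource = createResource
  try f(resource) finally resource.close()
}

// Hypothetical usage: the JarFile handle is released before the test's
// temporary directory is deleted, so Windows can remove the underlying file.
def hasManifest(jarPath: String): Boolean =
  tryWithResource(new JarFile(jarPath)) { jarFile =>
    jarFile.getManifest != null
  }
```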

`org.apache.spark.deploy.SparkSubmitSuite`

```
 - download one file to local *** FAILED *** (16 milliseconds)
   java.net.URISyntaxException: Illegal character in authority at index 6: s3a://C:\projects\spark\target\tmp\test2630198944759847458.jar

 - download list of files to local *** FAILED *** (0 milliseconds)
   java.net.URISyntaxException: Illegal character in authority at index 6: s3a://C:\projects\spark\target\tmp\test2783551769392880031.jar
```
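The `URISyntaxException` comes from interpolating a raw Windows absolute path (`C:\...`) straight after the `s3a://` scheme; the drive letter and backslashes are not valid there. `File.toURI.getPath` yields a forward-slash form that composes into a parseable URI. A rough illustration, where the concrete jar path is hypothetical:

```
import java.io.File

// Hypothetical jar location, used only for illustration.
val jarFile = new File("C:\\projects\\spark\\target\\tmp\\test.jar")

// Before: on Windows this interpolates to
// "s3a://C:\projects\spark\target\tmp\test.jar" -- the drive letter and
// backslashes are illegal in the URI, hence the URISyntaxException above.
val broken = s"s3a://${jarFile.getAbsolutePath}"

// After: File.toURI normalizes to forward slashes with a leading "/", so the
// result parses cleanly, e.g. "s3a:///C:/projects/spark/target/tmp/test.jar".
val fixed = s"s3a://${jarFile.toURI.getPath}"
```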

`org.apache.spark.scheduler.ReplayListenerSuite`

```
 - Replay compressed inprogress log file succeeding on partial read (156 milliseconds)
   Exception encountered when attempting to run a suite with class name:
   org.apache.spark.scheduler.ReplayListenerSuite *** ABORTED *** (1 second, 391 milliseconds)
   java.io.IOException: Failed to delete: C:\projects\spark\target\tmp\spark-8f3cacd6-faad-4121-b901-ba1bba8025a0

 - End-to-end replay *** FAILED *** (62 milliseconds)
   java.io.IOException: No FileSystem for scheme: C

 - End-to-end replay with compression *** FAILED *** (110 milliseconds)
   java.io.IOException: No FileSystem for scheme: C
```
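`No FileSystem for scheme: C` happens when a raw `C:\...` string ends up in a Hadoop `Path`: the drive letter can be read as the URI scheme when the path is later resolved through a `FileSystem`. Building the `Path` from `File.toURI` keeps an explicit `file:` scheme, which is what the change to `testApplicationReplay` does. A hedged sketch, with an illustrative directory name:

```
import java.io.File
import org.apache.hadoop.fs.Path

// Illustrative log directory; on Windows its absolute path looks like
// "C:\projects\spark\target\tmp\test-replay".
val logDir = new File("target/tmp/test-replay")

// Fragile: when resolved through a Hadoop FileSystem on Windows, the drive
// letter can be interpreted as a URI scheme, giving
// "No FileSystem for scheme: C".
val ambiguous = new Path(logDir.getAbsolutePath)

// Safer: constructing the Path from the file URI carries an explicit
// "file:/C:/..." scheme on every platform.
val explicit = new Path(logDir.toURI)
```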

`org.apache.spark.sql.hive.StatisticsSuite`

```
 - SPARK-21079 - analyze table with location different than that of individual partitions *** FAILED *** (875 milliseconds)
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);

 - SPARK-21079 - analyze partitioned table with only a subset of partitions visible *** FAILED *** (47 milliseconds)
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
```
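These failures trace back to interpolating a raw Windows path into the `LOCATION` clause, which trips up the metastore's `Path` handling (surfacing as the `Can not create a Path from an empty string` error above). Using the path's URI form keeps an explicit scheme. A small sketch; the table name `t1` and the directory are illustrative:

```
import java.io.File

// Illustrative table location; the table name t1 is hypothetical.
val path = new File("C:\\projects\\spark\\target\\tmp\\t1")

// Fragile on Windows: the raw "C:\..." string ends up in the DDL, which the
// Hive metastore cannot turn into a valid Path.
val fragileDdl = s"CREATE TABLE t1 (key STRING) LOCATION '$path'"

// Portable: File.toURI yields something like
// "file:/C:/projects/spark/target/tmp/t1", which carries an explicit scheme.
val portableDdl = s"CREATE TABLE t1 (key STRING) LOCATION '${path.toURI}'"
```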

**Note:** this PR does not fix:

`org.apache.spark.deploy.SparkSubmitSuite`

```
 - launch simple application with spark-submit with redaction *** FAILED *** (172 milliseconds)
   java.util.NoSuchElementException: next on empty iterator
```

I can't reproduce this on my Windows machine, but it appears to fail consistently on AppVeyor. The cause is still unclear to me and hard to debug, so I did not include a fix for it here.

**Note:** there appear to be more instances, but they are hard to identify, partly due to flakiness and partly due to the volume of logs and errors. I will probably make another pass later if that is fine.

## How was this patch tested?

Manually via AppVeyor:

**Before**

- `org.apache.spark.deploy.RPackageUtilsSuite`: https://ci.appveyor.com/project/spark-test/spark/build/771-windows-fix/job/8t8ra3lrljuir7q4
- `org.apache.spark.deploy.SparkSubmitSuite`: https://ci.appveyor.com/project/spark-test/spark/build/771-windows-fix/job/taquy84yudjjen64
- `org.apache.spark.scheduler.ReplayListenerSuite`: https://ci.appveyor.com/project/spark-test/spark/build/771-windows-fix/job/24omrfn2k0xfa9xq
- `org.apache.spark.sql.hive.StatisticsSuite`: https://ci.appveyor.com/project/spark-test/spark/build/771-windows-fix/job/2079y1plgj76dc9l

**After**

- `org.apache.spark.deploy.RPackageUtilsSuite`: https://ci.appveyor.com/project/spark-test/spark/build/775-windows-fix/job/3803dbfn89ne1164
- `org.apache.spark.deploy.SparkSubmitSuite`: https://ci.appveyor.com/project/spark-test/spark/build/775-windows-fix/job/m5l350dp7u9a4xjr
- `org.apache.spark.scheduler.ReplayListenerSuite`: https://ci.appveyor.com/project/spark-test/spark/build/775-windows-fix/job/565vf74pp6bfdk18
- `org.apache.spark.sql.hive.StatisticsSuite`: https://ci.appveyor.com/project/spark-test/spark/build/775-windows-fix/job/qm78tsk8c37jb6s4

Jenkins tests are required and AppVeyor tests will be triggered.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #18971 from HyukjinKwon/windows-fixes.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b30a11a6
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b30a11a6
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b30a11a6

Branch: refs/heads/master
Commit: b30a11a6acf4b1512b5759f21ae58e69662ba455
Parents: 734ed7a
Author: hyukjinkwon <gurwls223@gmail.com>
Authored: Wed Aug 30 21:35:52 2017 +0900
Committer: hyukjinkwon <gurwls223@gmail.com>
Committed: Wed Aug 30 21:35:52 2017 +0900

----------------------------------------------------------------------
 .../spark/deploy/RPackageUtilsSuite.scala       |  7 +--
 .../apache/spark/deploy/SparkSubmitSuite.scala  |  4 +-
 .../spark/scheduler/ReplayListenerSuite.scala   | 53 +++++++++++---------
 .../apache/spark/sql/hive/StatisticsSuite.scala |  6 +--
 4 files changed, 39 insertions(+), 31 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/b30a11a6/core/src/test/scala/org/apache/spark/deploy/RPackageUtilsSuite.scala
----------------------------------------------------------------------
diff --git a/core/src/test/scala/org/apache/spark/deploy/RPackageUtilsSuite.scala b/core/src/test/scala/org/apache/spark/deploy/RPackageUtilsSuite.scala
index 5e0bf6d..32dd3ec 100644
--- a/core/src/test/scala/org/apache/spark/deploy/RPackageUtilsSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/RPackageUtilsSuite.scala
@@ -137,9 +137,10 @@ class RPackageUtilsSuite
     IvyTestUtils.withRepository(main, None, None) { repo =>
       val jar = IvyTestUtils.packJar(new File(new URI(repo)), dep1, Nil,
         useIvyLayout = false, withR = false, None)
-      val jarFile = new JarFile(jar)
-      assert(jarFile.getManifest == null, "jar file should have null manifest")
-      assert(!RPackageUtils.checkManifestForR(jarFile), "null manifest should return false")
+      Utils.tryWithResource(new JarFile(jar)) { jarFile =>
+        assert(jarFile.getManifest == null, "jar file should have null manifest")
+        assert(!RPackageUtils.checkManifestForR(jarFile), "null manifest should return false")
+      }
     }
   }
 

http://git-wip-us.apache.org/repos/asf/spark/blob/b30a11a6/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
----------------------------------------------------------------------
diff --git a/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala b/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
index 724096d..7400ceb 100644
--- a/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
@@ -831,7 +831,7 @@ class SparkSubmitSuite
     val hadoopConf = new Configuration()
     val tmpDir = Files.createTempDirectory("tmp").toFile
     updateConfWithFakeS3Fs(hadoopConf)
-    val sourcePath = s"s3a://${jarFile.getAbsolutePath}"
+    val sourcePath = s"s3a://${jarFile.toURI.getPath}"
     val outputPath = DependencyUtils.downloadFile(sourcePath, tmpDir, sparkConf, hadoopConf,
       new SecurityManager(sparkConf))
     checkDownloadedFile(sourcePath, outputPath)
@@ -847,7 +847,7 @@ class SparkSubmitSuite
     val hadoopConf = new Configuration()
     val tmpDir = Files.createTempDirectory("tmp").toFile
     updateConfWithFakeS3Fs(hadoopConf)
-    val sourcePaths = Seq("/local/file", s"s3a://${jarFile.getAbsolutePath}")
+    val sourcePaths = Seq("/local/file", s"s3a://${jarFile.toURI.getPath}")
     val outputPaths = DependencyUtils
       .downloadFileList(sourcePaths.mkString(","), tmpDir, sparkConf, hadoopConf,
         new SecurityManager(sparkConf))

http://git-wip-us.apache.org/repos/asf/spark/blob/b30a11a6/core/src/test/scala/org/apache/spark/scheduler/ReplayListenerSuite.scala
----------------------------------------------------------------------
diff --git a/core/src/test/scala/org/apache/spark/scheduler/ReplayListenerSuite.scala b/core/src/test/scala/org/apache/spark/scheduler/ReplayListenerSuite.scala
index 88a68af..d17e386 100644
--- a/core/src/test/scala/org/apache/spark/scheduler/ReplayListenerSuite.scala
+++ b/core/src/test/scala/org/apache/spark/scheduler/ReplayListenerSuite.scala
@@ -21,6 +21,7 @@ import java.io._
 import java.net.URI
 import java.util.concurrent.atomic.AtomicInteger
 
+import org.apache.hadoop.fs.Path
 import org.json4s.jackson.JsonMethods._
 import org.scalatest.BeforeAndAfter
 
@@ -84,24 +85,23 @@ class ReplayListenerSuite extends SparkFunSuite with BeforeAndAfter with LocalSp
     val buffered = new ByteArrayOutputStream
     val codec = new LZ4CompressionCodec(new SparkConf())
     val compstream = codec.compressedOutputStream(buffered)
-    val writer = new PrintWriter(compstream)
+    Utils.tryWithResource(new PrintWriter(compstream)) { writer =>
 
-    val applicationStart = SparkListenerApplicationStart("AppStarts", None,
-      125L, "Mickey", None)
-    val applicationEnd = SparkListenerApplicationEnd(1000L)
+      val applicationStart = SparkListenerApplicationStart("AppStarts", None,
+        125L, "Mickey", None)
+      val applicationEnd = SparkListenerApplicationEnd(1000L)
 
-    // scalastyle:off println
-    writer.println(compact(render(JsonProtocol.sparkEventToJson(applicationStart))))
-    writer.println(compact(render(JsonProtocol.sparkEventToJson(applicationEnd))))
-    // scalastyle:on println
-    writer.close()
+      // scalastyle:off println
+      writer.println(compact(render(JsonProtocol.sparkEventToJson(applicationStart))))
+      writer.println(compact(render(JsonProtocol.sparkEventToJson(applicationEnd))))
+      // scalastyle:on println
+    }
 
     val logFilePath = Utils.getFilePath(testDir, "events.lz4.inprogress")
-    val fstream = fileSystem.create(logFilePath)
     val bytes = buffered.toByteArray
-
-    fstream.write(bytes, 0, buffered.size)
-    fstream.close
+    Utils.tryWithResource(fileSystem.create(logFilePath)) { fstream =>
+      fstream.write(bytes, 0, buffered.size)
+    }
 
     // Read the compressed .inprogress file and verify only first event was parsed.
     val conf = EventLoggingListenerSuite.getLoggingConf(logFilePath)
@@ -112,17 +112,19 @@ class ReplayListenerSuite extends SparkFunSuite with BeforeAndAfter with LocalSp
 
     // Verify the replay returns the events given the input maybe truncated.
     val logData = EventLoggingListener.openEventLog(logFilePath, fileSystem)
-    val failingStream = new EarlyEOFInputStream(logData, buffered.size - 10)
-    replayer.replay(failingStream, logFilePath.toString, true)
+    Utils.tryWithResource(new EarlyEOFInputStream(logData, buffered.size - 10)) { failingStream =>
+      replayer.replay(failingStream, logFilePath.toString, true)
 
-    assert(eventMonster.loggedEvents.size === 1)
-    assert(failingStream.didFail)
+      assert(eventMonster.loggedEvents.size === 1)
+      assert(failingStream.didFail)
+    }
 
     // Verify the replay throws the EOF exception since the input may not be truncated.
     val logData2 = EventLoggingListener.openEventLog(logFilePath, fileSystem)
-    val failingStream2 = new EarlyEOFInputStream(logData2, buffered.size - 10)
-    intercept[EOFException] {
-      replayer.replay(failingStream2, logFilePath.toString, false)
+    Utils.tryWithResource(new EarlyEOFInputStream(logData2, buffered.size - 10)) { failingStream2 =>
+      intercept[EOFException] {
+        replayer.replay(failingStream2, logFilePath.toString, false)
+      }
     }
   }
 
@@ -151,7 +153,10 @@ class ReplayListenerSuite extends SparkFunSuite with BeforeAndAfter with LocalSp
    * assumption that the event logging behavior is correct (tested in a separate suite).
    */
   private def testApplicationReplay(codecName: Option[String] = None) {
-    val logDirPath = Utils.getFilePath(testDir, "test-replay")
+    val logDir = new File(testDir.getAbsolutePath, "test-replay")
+    // Here, it creates `Path` from the URI instead of the absolute path for the explicit file
+    // scheme so that the string representation of this `Path` has leading file scheme correctly.
+    val logDirPath = new Path(logDir.toURI)
     fileSystem.mkdirs(logDirPath)
 
     val conf = EventLoggingListenerSuite.getLoggingConf(logDirPath, codecName)
@@ -221,12 +226,14 @@ class ReplayListenerSuite extends SparkFunSuite with BeforeAndAfter with LocalSp
     def didFail: Boolean = countDown.get == 0
 
     @throws[IOException]
-    def read: Int = {
+    override def read(): Int = {
       if (countDown.get == 0) {
         throw new EOFException("Stream ended prematurely")
       }
       countDown.decrementAndGet()
-      in.read
+      in.read()
     }
+
+    override def close(): Unit = in.close()
   }
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/b30a11a6/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala
----------------------------------------------------------------------
diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala
index dc61407..03e50e4 100644
--- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala
+++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala
@@ -203,7 +203,7 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
           sql(s"INSERT INTO TABLE $tableName PARTITION (ds='$ds') SELECT * FROM src")
         }
 
-        sql(s"ALTER TABLE $tableName SET LOCATION '$path'")
+        sql(s"ALTER TABLE $tableName SET LOCATION '${path.toURI}'")
 
         sql(s"ANALYZE TABLE $tableName COMPUTE STATISTICS noscan")
 
@@ -222,7 +222,7 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
             s"""
                |CREATE TABLE $sourceTableName (key STRING, value STRING)
                |PARTITIONED BY (ds STRING)
-               |LOCATION '$path'
+               |LOCATION '${path.toURI}'
              """.stripMargin)
 
           val partitionDates = List("2010-01-01", "2010-01-02", "2010-01-03")
@@ -239,7 +239,7 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
             s"""
                |CREATE TABLE $tableName (key STRING, value STRING)
                |PARTITIONED BY (ds STRING)
-               |LOCATION '$path'
+               |LOCATION '${path.toURI}'
              """.stripMargin)
 
           // Register only one of the partitions found on disk

