spark-issues mailing list archives

From "Kay Ousterhout (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-19988) Flaky Test: OrcSourceSuite SPARK-19459/SPARK-18220: read char/varchar column written by Hive
Date Fri, 17 Mar 2017 02:14:41 GMT

    [ https://issues.apache.org/jira/browse/SPARK-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15929334#comment-15929334 ]

Kay Ousterhout commented on SPARK-19988:
----------------------------------------

With some help from [~joshrosen] I spent some time digging into this and found:

(1) If you look at the failures, they're all from the Maven build.  In fact, 100% of the Maven
builds shown there fail (and none of the SBT ones).  This is odd, because the test is also failing
on the PR builder, which uses SBT.

(2) The Maven build failures are all accompanied by 3 other test failures; the group of 4 tests seems
to consistently fail together.  3 of the tests fail with errors similar to this one (saying that
some database does not exist).  The 4th test, org.apache.spark.sql.hive.execution.HiveCatalogedDDLSuite:
create temporary view using, fails with a more meaningful error.  I filed SPARK-19990 for that issue.

(3) A commit from right around the time the tests started failing, https://github.com/apache/spark/commit/09829be621f0f9bb5076abb3d832925624699fa9#diff-b7094baa12601424a5d19cb930e3402fR46,
added code to remove all of the databases after each test.  I wonder if that's somehow getting
run concurrently or asynchronously in the Maven build (after the HiveCatalogedDDLSuite fails),
which would explain why the error in the DDLSuite causes the other tests to fail saying that a database
can't be found.  I have extremely limited knowledge of both (a) how the Maven tests are executed
and (b) the SQL code, so it's possible these are totally unrelated issues.
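
To make the hypothesis in (3) concrete, here is a minimal sketch of the kind of interleaving
I have in mind.  It does not use Spark or Hive at all: SharedCatalog, dropAllNonDefault and
InterleavingDemo are hypothetical names made up for illustration, and the interleaving is
written out sequentially rather than with real threads.  It only shows that if one suite's
"drop every database" cleanup runs between another suite's setup and its query, the second
suite would see a "Database does not exist" style error.

```scala
// Toy stand-in for a catalog shared by all suites in one JVM.
// Hypothetical: this is NOT Spark's or Hive's catalog API.
object SharedCatalog {
  private var databases: Set[String] = Set("default")

  def createDatabase(name: String): Unit = synchronized {
    databases += name
  }

  // Returns Right(name) if the database exists, otherwise an error message.
  def useDatabase(name: String): Either[String, String] = synchronized {
    if (databases.contains(name)) Right(name)
    else Left(s"Database does not exist: $name")
  }

  // Models the idea of "remove all of the databases after each test":
  // everything except "default" is dropped.
  def dropAllNonDefault(): Unit = synchronized {
    databases = databases.filter(_ == "default")
  }
}

object InterleavingDemo {
  // One possible ordering of events, assumed for illustration only.
  def run(): Either[String, String] = {
    SharedCatalog.createDatabase("db2") // suite A: test setup
    SharedCatalog.dropAllNonDefault()   // suite B: cleanup after a (failed) test
    SharedCatalog.useDatabase("db2")    // suite A: query now fails
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Whether the real cleanup can actually overlap with another suite like this depends on how the
Maven build schedules suites, which is exactly the part I'm unsure about.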

None of this explains why the test is failing in the PR builder, where the failures have been
isolated to this test.

> Flaky Test: OrcSourceSuite SPARK-19459/SPARK-18220: read char/varchar column written by Hive
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-19988
>                 URL: https://issues.apache.org/jira/browse/SPARK-19988
>             Project: Spark
>          Issue Type: Test
>          Components: SQL, Tests
>    Affects Versions: 2.2.0
>            Reporter: Imran Rashid
>              Labels: flaky-test
>         Attachments: trimmed-unit-test.log
>
>
> "OrcSourceSuite SPARK-19459/SPARK-18220: read char/varchar column written by Hive" fails a lot -- right now, I see about a 50% pass rate in the last 3 days here:
> https://spark-tests.appspot.com/test-details?suite_name=org.apache.spark.sql.hive.orc.OrcSourceSuite&test_name=SPARK-19459%2FSPARK-18220%3A+read+char%2Fvarchar+column+written+by+Hive
> e.g. https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74683/testReport/junit/org.apache.spark.sql.hive.orc/OrcSourceSuite/SPARK_19459_SPARK_18220__read_char_varchar_column_written_by_Hive/
> {noformat}
> sbt.ForkMain$ForkError: org.apache.spark.sql.execution.QueryExecutionException: FAILED: SemanticException [Error 10072]: Database does not exist: db2
> 	at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:637)
> 	at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$runHive$1.apply(HiveClientImpl.scala:621)
> 	at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:288)
> 	at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:229)
> 	at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:228)
> 	at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:271)
> 	at org.apache.spark.sql.hive.client.HiveClientImpl.runHive(HiveClientImpl.scala:621)
> 	at org.apache.spark.sql.hive.client.HiveClientImpl.runSqlHive(HiveClientImpl.scala:611)
> 	at org.apache.spark.sql.hive.orc.OrcSuite$$anonfun$7.apply$mcV$sp(OrcSourceSuite.scala:160)
> 	at org.apache.spark.sql.hive.orc.OrcSuite$$anonfun$7.apply(OrcSourceSuite.scala:155)
> 	at org.apache.spark.sql.hive.orc.OrcSuite$$anonfun$7.apply(OrcSourceSuite.scala:155)
> ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


