spark-issues mailing list archives

From "Bogdan Raducanu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-20407) ParquetQuerySuite flaky test
Date Thu, 20 Apr 2017 10:23:04 GMT

     [ https://issues.apache.org/jira/browse/SPARK-20407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bogdan Raducanu updated SPARK-20407:
------------------------------------
    Description: 
The ParquetQuerySuite test "Enabling/disabling ignoreCorruptFiles" can sometimes fail. The
cause is that when one task fails, the driver call returns and the test code continues, but
other tasks may still be running and are only killed at the next kill point.

There are 2 specific issues created by this:
1. Files can be closed some time after the test finishes, so DebugFilesystem.assertNoOpenStreams
fails. One solution is to change SharedSqlContext so that assertNoOpenStreams is called inside
eventually {} (see the first sketch after this list).

2. The ParquetFileReader constructor from Apache Parquet 1.8.2 can leak a stream at line 538:
when the next line throws an exception, the constructor fails and Spark has no way to close
the file. This happens in this test because the temporary directory is deleted at the end of
the test while tasks might still be running, and the deletion makes the constructor fail.
The fix could be to Thread.sleep at the end of the test or, better, to wait until all tasks
are definitely killed before the test finishes (see the second sketch after this list).
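
A minimal sketch of the eventually {} idea for issue 1, assuming the shared test context can
override ScalaTest's afterEach and that DebugFilesystem.assertNoOpenStreams() keeps throwing
while streams are still open; the trait name and the timeouts below are illustrative, not the
actual suite code:

{code:scala}
import org.scalatest.{BeforeAndAfterEach, Suite}
import org.scalatest.concurrent.Eventually._
import org.scalatest.time.SpanSugar._

import org.apache.spark.DebugFilesystem

// Sketch: a mix-in for the shared test context that retries the open-stream check
// instead of asserting immediately, so straggler tasks that are still being killed
// after a failed job get a chance to close their files first.
trait EventuallyNoOpenStreams extends BeforeAndAfterEach { self: Suite =>
  protected override def afterEach(): Unit = {
    try {
      super.afterEach()
    } finally {
      // Retry until the assertion stops throwing or the timeout expires.
      eventually(timeout(10.seconds), interval(100.milliseconds)) {
        DebugFilesystem.assertNoOpenStreams()
      }
    }
  }
}
{code}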
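
And a hedged sketch of the "wait for all tasks" alternative for issue 2: count running tasks
with a SparkListener and block until the count drops to zero before the test deletes its
temporary directory. The TaskCompletionWait object, the counter class and the timeouts are
hypothetical helpers for illustration, not existing Spark test utilities:

{code:scala}
import java.util.concurrent.atomic.AtomicInteger

import org.scalatest.Assertions._
import org.scalatest.concurrent.Eventually._
import org.scalatest.time.SpanSugar._

import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd, SparkListenerTaskStart}

object TaskCompletionWait {

  // Counts tasks that have started but not yet ended, including stragglers that
  // are still being killed after the job has already failed.
  class RunningTaskCounter extends SparkListener {
    val running = new AtomicInteger(0)
    override def onTaskStart(taskStart: SparkListenerTaskStart): Unit = running.incrementAndGet()
    override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = running.decrementAndGet()
  }

  // Register the counter with sparkContext.addSparkListener(counter) before running
  // the job that is expected to fail, then call this right before deleting the temp dir.
  def awaitNoRunningTasks(counter: RunningTaskCounter): Unit = {
    eventually(timeout(10.seconds), interval(100.milliseconds)) {
      assert(counter.running.get() == 0)
    }
  }
}
{code}

A plain Thread.sleep would also unblock the test, but waiting on the listener ties the cleanup
to the actual task lifecycle instead of a guessed delay.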

  was:
ParquetQuerySuite test "Enabling/disabling ignoreCorruptFiles" can sometimes fail. This is
caused by the fact that when one task fails the driver call returns and test code continues,
but there might still be tasks running that will be killed at the next killing point.

There are 2 specific issues created by this:
1. Files are closed after the test finishes, so DebugFilesystem.assertNoOpenStreams fails.
One solution for this is to change SharedSqlContext and call assertNoOpenStreams inside eventually
{}

2. ParquetFileReader constructor from apache parquet 1.8.2 can leak a stream at line 538.
This happens when the next line throws an exception. So, the constructor fails and Spark doesn't
have any way to close the file.
This happens in this test because the test deletes the temporary directory at the end (but
while tasks might still be running). Deleting the directory causes the constructor to fail.
The solution for this could be to Thread.sleep at the end of the test or to somehow wait for
all tasks to be definitely killed before finishing the test


> ParquetQuerySuite flaky test
> ----------------------------
>
>                 Key: SPARK-20407
>                 URL: https://issues.apache.org/jira/browse/SPARK-20407
>             Project: Spark
>          Issue Type: Bug
>          Components: Tests
>    Affects Versions: 2.2.0
>            Reporter: Bogdan Raducanu
>
> The ParquetQuerySuite test "Enabling/disabling ignoreCorruptFiles" can sometimes fail. The
> cause is that when one task fails, the driver call returns and the test code continues, but
> other tasks may still be running and are only killed at the next kill point.
> There are 2 specific issues created by this:
> 1. Files can be closed some time after the test finishes, so DebugFilesystem.assertNoOpenStreams
> fails. One solution is to change SharedSqlContext so that assertNoOpenStreams is called inside
> eventually {}
> 2. The ParquetFileReader constructor from Apache Parquet 1.8.2 can leak a stream at line 538:
> when the next line throws an exception, the constructor fails and Spark has no way to close
> the file.
> This happens in this test because the temporary directory is deleted at the end of the test
> while tasks might still be running, and the deletion makes the constructor fail.
> The fix could be to Thread.sleep at the end of the test or, better, to wait until all tasks
> are definitely killed before the test finishes.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

