spark-issues mailing list archives

From "Feng Gui (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-19779) structured streaming exist needless tmp file
Date Wed, 01 Mar 2017 16:46:45 GMT

    [ https://issues.apache.org/jira/browse/SPARK-19779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15890528#comment-15890528 ]

Feng Gui commented on SPARK-19779:
----------------------------------

[~srowen] The `Background maintenance` task doesn't clean up files whose names start with `temp`,
so I think the temp file is never deleted. However, the temp file doesn't cause incorrect results
for the Structured Streaming job.

> structured streaming exist needless tmp file 
> ---------------------------------------------
>
>                 Key: SPARK-19779
>                 URL: https://issues.apache.org/jira/browse/SPARK-19779
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.1.0
>            Reporter: Feng Gui
>            Priority: Minor
>
> The PR (https://github.com/apache/spark/pull/17012) fixes restarting a Structured Streaming
application that uses HDFS as its file system, but one problem remains: a tmp file for the delta
file is left behind in HDFS. Structured Streaming never deletes the tmp file generated when the
streaming job restarts, so we need to delete that tmp file after the job restarts.
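
The cleanup asked for above could look roughly like the following. This is only a minimal sketch under stated assumptions, not Spark's actual fix: it works on a local directory standing in for the state store directory, and the function name `remove_stale_temp_files` and its parameters are hypothetical. The `temp` prefix comes from the comment above. A real job on HDFS would need the Hadoop FileSystem API (or `hdfs dfs -rm`) instead of `os`.

```python
import os
import time


def remove_stale_temp_files(state_dir, prefix="temp", min_age_seconds=0.0):
    """Delete regular files in state_dir whose names start with prefix.

    Hypothetical helper for illustration only. A file is removed only if it
    was last modified at least min_age_seconds ago, so a temp file that a
    running task may still be writing can be left alone.
    Returns the sorted names of the deleted files.
    """
    now = time.time()
    removed = []
    for name in os.listdir(state_dir):
        path = os.path.join(state_dir, name)
        # Skip committed files (e.g. delta files) and anything that is not a plain file.
        if not name.startswith(prefix) or not os.path.isfile(path):
            continue
        # Clamp at zero in case the file timestamp is slightly ahead of the clock.
        age = max(0.0, now - os.path.getmtime(path))
        if age >= min_age_seconds:
            os.remove(path)
            removed.append(name)
    return removed if not removed else sorted(removed)
```

Calling `remove_stale_temp_files(path, min_age_seconds=3600)` after a restart would leave recently written temp files in place and remove only older leftovers. The one-hour threshold is an arbitrary illustrative value.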



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


