spark-issues mailing list archives

From "Michael Armbrust (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-8966) Design a mechanism to ensure that temporary files created in tasks are cleaned up after failures
Date Tue, 01 Dec 2015 05:06:11 GMT

     [ https://issues.apache.org/jira/browse/SPARK-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Armbrust updated SPARK-8966:
------------------------------------
    Target Version/s:   (was: 1.6.0)

> Design a mechanism to ensure that temporary files created in tasks are cleaned up after failures
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-8966
>                 URL: https://issues.apache.org/jira/browse/SPARK-8966
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>            Reporter: Josh Rosen
>
> It's important to avoid leaking temporary files, such as spill files created by the
> external sorter. Individual operators should still make an effort to clean up their own
> files and perform their own error handling, but I think that we should add a safety-net
> mechanism to track file creation on a per-task basis and automatically clean up leaked files.
> During tests, this mechanism should throw an exception when a leak is detected. In
> production deployments, it should log a warning and clean up the leak itself. This is
> similar to the TaskMemoryManager's leak detection and cleanup code.
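
A minimal sketch of the safety net described above, assuming a per-task registry object. The class name `TaskTempFileTracker`, its methods, and the `failOnLeak` flag are hypothetical names chosen for illustration, not existing Spark APIs. Files are registered when created; at task end, any registered file still on disk counts as a leak, which is deleted and then either reported by an exception (test mode) or logged as a warning (production mode).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-task registry of temporary files; not an existing Spark class. */
final class TaskTempFileTracker {
    private final List<Path> registered = new ArrayList<>();
    // Assumption: true when running tests, false in production deployments.
    private final boolean failOnLeak;

    TaskTempFileTracker(boolean failOnLeak) {
        this.failOnLeak = failOnLeak;
    }

    /** Creates a temporary file and records it against this task. */
    synchronized Path createTempFile(String prefix) throws IOException {
        Path file = Files.createTempFile(prefix, ".tmp");
        registered.add(file);
        return file;
    }

    /**
     * Intended to be called once when the task finishes, whether it succeeded or failed.
     * Deletes every registered file that still exists and returns those files.
     */
    synchronized List<Path> cleanUpLeaks() throws IOException {
        List<Path> leaked = new ArrayList<>();
        for (Path file : registered) {
            // deleteIfExists returns true only if the file was still on disk,
            // meaning the operator that created it did not clean it up.
            if (Files.deleteIfExists(file)) {
                leaked.add(file);
            }
        }
        registered.clear();
        if (!leaked.isEmpty()) {
            if (failOnLeak) {
                // Test mode: the leak is cleaned up, then surfaced as a failure.
                throw new IllegalStateException("Leaked temporary files: " + leaked);
            }
            // Production mode: warn only. A real implementation would use a logger.
            System.err.println("WARN: cleaned up leaked temporary files: " + leaked);
        }
        return leaked;
    }
}
```

An operator that deletes its own spill files before the task ends produces no warning here; only files that remain on disk at task completion are treated as leaks.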
> We may be able to implement this via a convenience method that registers task completion
> handlers with TaskContext.
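
One way such a convenience method could look, sketched against a stand-in context so the example is self-contained. `SketchTaskContext`, `addCompletionListener`, `markTaskCompleted`, and `TaskTempFiles.createForTask` are all hypothetical names for this sketch; Spark's real `TaskContext` is a different class with its own listener API, and this code does not call it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Stand-in for a task context; hypothetical, NOT Spark's TaskContext class. */
final class SketchTaskContext {
    private final List<Runnable> completionListeners = new ArrayList<>();

    void addCompletionListener(Runnable listener) {
        completionListeners.add(listener);
    }

    /** Runs every listener; meant to be invoked on both success and failure of the task. */
    void markTaskCompleted() {
        for (Runnable listener : completionListeners) {
            try {
                listener.run();
            } catch (RuntimeException e) {
                // One failing listener must not prevent the remaining cleanups.
                System.err.println("WARN: completion listener failed: " + e);
            }
        }
        completionListeners.clear();
    }
}

/** Hypothetical convenience helper that ties a temp file's lifetime to a task. */
final class TaskTempFiles {
    private TaskTempFiles() {}

    static Path createForTask(SketchTaskContext context, String prefix) throws IOException {
        Path file = Files.createTempFile(prefix, ".tmp");
        // Register the cleanup at creation time so it cannot be forgotten later.
        context.addCompletionListener(() -> {
            try {
                Files.deleteIfExists(file);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        return file;
    }
}
```

Registering the handler inside the creation call means callers cannot create a tracked file without also scheduling its deletion, which is the property the safety net relies on.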
> We might also explore techniques that will cause files to be cleaned up automatically
> when their file descriptors are closed (e.g. by calling unlink on an open file). These
> techniques should not be our last line of defense against file resource leaks, though,
> since they might be platform-specific and may clean up resources later than we'd like.
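
As an illustration of the descriptor-based approach, the JDK offers `StandardOpenOption.DELETE_ON_CLOSE`, which asks the runtime to delete the file when the stream or channel is closed. The JDK documents this as a best-effort option, and the exact moment of deletion depends on the platform, which matches the caveat above. The helper name `writeScratch` and the file naming scheme are assumptions made for this sketch.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper demonstrating delete-on-close scratch files. */
final class DeleteOnCloseSketch {
    private DeleteOnCloseSketch() {}

    /** Writes data to a scratch file in dir; the file is gone once the stream is closed. */
    static Path writeScratch(Path dir, byte[] data) throws IOException {
        Path file = dir.resolve("scratch-" + System.nanoTime() + ".tmp");
        try (OutputStream out = Files.newOutputStream(
                file,
                StandardOpenOption.CREATE_NEW,
                StandardOpenOption.WRITE,
                StandardOpenOption.DELETE_ON_CLOSE)) {
            out.write(data);
        } // try-with-resources closes the stream, which triggers the deletion
        return file;
    }
}
```

Because the file disappears with the stream, this pattern only suits data that is read back through the same open handle; it does not help with files that must outlive the descriptor that created them.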



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

