spark-issues mailing list archives

From "carlmartin (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-17233) Shuffle file will be left over the capacity when dynamic schedule is enabled in a long running case.
Date Thu, 25 Aug 2016 03:29:20 GMT
carlmartin created SPARK-17233:
----------------------------------

             Summary: Shuffle file will be left over the capacity when dynamic schedule is
enabled in a long running case.
                 Key: SPARK-17233
                 URL: https://issues.apache.org/jira/browse/SPARK-17233
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.0.0, 1.6.2, 1.5.2
            Reporter: carlmartin


When I run SQL statements periodically in a long-running Thrift server, the disk fills up
after about one week.
Checking the files on Linux, I found many shuffle files left in the block manager (blockmgr)
directories, even though their shuffle stages had finished long ago.
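
To check a node for this symptom, the following is a minimal sketch in Python. It is not part of the original report. It assumes the usual on-disk layout, in which block manager directories are named `blockmgr-<uuid>` and shuffle files are named `shuffle_<shuffleId>_<mapId>_<reduceId>.data` or `.index`. The function name and the age threshold are hypothetical.

```python
import os
import re
import time

# Assumed naming of shuffle block files: shuffle_<shuffleId>_<mapId>_<reduceId>.data / .index
SHUFFLE_FILE = re.compile(r"^shuffle_\d+_\d+_\d+\.(data|index)$")


def find_stale_shuffle_files(local_dir, min_age_seconds, now=None):
    """Return sorted (path, size_bytes) pairs for old shuffle files under blockmgr-* dirs."""
    now = time.time() if now is None else now
    stale = []
    for root, _dirs, files in os.walk(local_dir):
        # Only consider paths that pass through a block manager directory.
        if not any(part.startswith("blockmgr-") for part in root.split(os.sep)):
            continue
        for name in files:
            if not SHUFFLE_FILE.match(name):
                continue
            path = os.path.join(root, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished while scanning
            # Report files last modified at least min_age_seconds ago.
            if now - st.st_mtime >= min_age_seconds:
                stale.append((path, st.st_size))
    return sorted(stale)
```

For example, `find_stale_shuffle_files(local_dir, 24 * 3600)` lists shuffle files untouched for a day, where `local_dir` is whatever directory the deployment uses for `spark.local.dir`. Age alone does not prove a file is orphaned, so treat the result as a list of candidates to inspect rather than files to delete.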
Eventually I found that when shuffle files need to be cleaned, the driver tells each executor
to do the ShuffleClean. But when dynamic allocation is enabled, an executor may shut itself
down, and a stopped executor cannot clean its shuffle files, so the files are left behind.
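
The mechanism described above can be illustrated with a toy model in Python. This is only a sketch of the reported behavior, not Spark's code. The class `ToyCluster` and all of its methods are hypothetical names, and the model assumes that shuffle files stay on disk after the executor that wrote them exits.

```python
class ToyCluster:
    """Toy model: cleanup requests reach only executors that are still alive."""

    def __init__(self):
        self.live_executors = set()
        # Set of (shuffle_id, executor_id); files outlive the executor process.
        self.disk = set()

    def add_executor(self, executor_id):
        self.live_executors.add(executor_id)

    def write_shuffle(self, executor_id, shuffle_id):
        # An executor writes its shuffle output to local disk.
        self.disk.add((shuffle_id, executor_id))

    def remove_idle_executor(self, executor_id):
        # The executor goes away, but nothing deletes its files.
        self.live_executors.discard(executor_id)

    def clean_shuffle(self, shuffle_id):
        # The driver asks each live executor to delete its files for this shuffle.
        for sid, owner in list(self.disk):
            if sid == shuffle_id and owner in self.live_executors:
                self.disk.remove((sid, owner))


cluster = ToyCluster()
cluster.add_executor("exec-1")
cluster.add_executor("exec-2")
cluster.write_shuffle("exec-1", 0)
cluster.write_shuffle("exec-2", 0)
cluster.remove_idle_executor("exec-2")  # executor released while idle
cluster.clean_shuffle(0)
leftover = sorted(cluster.disk)  # only the removed executor's file remains
```

In this model the file written by `exec-2` is never deleted, because no live process is left to receive the cleanup request. Repeated over a week of periodic queries, such leftovers would add up to the full disk described in the report.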





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

