spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (SPARK-16139) Audit tests for leaked threads
Date Tue, 05 Dec 2017 11:30:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-16139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-16139:
------------------------------------

    Assignee: Apache Spark

> Audit tests for leaked threads
> ------------------------------
>
>                 Key: SPARK-16139
>                 URL: https://issues.apache.org/jira/browse/SPARK-16139
>             Project: Spark
>          Issue Type: Test
>          Components: Tests
>    Affects Versions: 2.0.0
>            Reporter: Imran Rashid
>            Assignee: Apache Spark
>
> Lots of our tests don't properly shut down everything they create, and end up leaking
> lots of threads.  For example, {{TaskSetManagerSuite}} doesn't stop the extra {{TaskScheduler}}
> and {{DAGScheduler}} it creates.  There are a couple more instances I've run into recently,
> e.g., in [{{DAGSchedulerSuite}}|https://github.com/apache/spark/commit/cf1995a97645f0b44c997f4fdbba631fd6b91a16#diff-f3b410b16818d8f34bb1eb4120a60d51R235].
> I'm fixing these piecemeal when I see them (e.g., {{TaskSetManagerSuite}} should be fixed
> by my PR for SPARK-16136), but it would be great to have a comprehensive audit and fix this
> across all tests.
> This should be semi-automatable.  In {{SparkFunSuite}}, you could grab all threads before
> the test starts, and again after it completes.  Then you could clearly log all threads that
> were started after the test began but are still running.  Unfortunately this isn't perfect: it
> seems that netty threads aren't killed immediately on shutdown.  It's OK if some of them linger
> beyond the test, so you may need to do some whitelisting based on thread name and a little more
> manual inspection.  But you could at least clearly log the relevant info, so that after a Jenkins
> run you could easily pull the info from the logs.
> Bonus points if you can figure out some way to make this output visible outside of the
> logs, ideally even in the test report that makes it to GitHub, but that isn't necessary, and
> unless it's very easy it is probably best left for a separate task.
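
The snapshot-and-compare audit described above could look roughly like the
following. This is only a sketch in plain Java against the standard JVM thread
API, not the change that was made to SparkFunSuite. The class name ThreadAudit,
both method names, and the "netty" name prefix in the whitelist are assumptions
made for illustration; real whitelist entries would have to come from the thread
names that actually appear in the logs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Sketch of a per-test thread audit. All names in this class are hypothetical. */
public class ThreadAudit {

    // Assumed whitelist: name prefixes of threads that may outlive a test.
    // "netty" is a placeholder, not a verified thread name.
    static final List<String> ALLOWED_PREFIXES = Arrays.asList("netty");

    /** Returns the set of threads that are live right now; call before the test starts. */
    static Set<Thread> snapshot() {
        return new HashSet<>(Thread.getAllStackTraces().keySet());
    }

    /** Returns sorted names of live, non-whitelisted threads that were not in the snapshot. */
    static List<String> leakedThreadNames(Set<Thread> before) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(Thread::isAlive)
                .filter(t -> !before.contains(t))
                .map(Thread::getName)
                .filter(name -> ALLOWED_PREFIXES.stream().noneMatch(name::startsWith))
                .sorted()
                .collect(Collectors.toList());
    }
}
```

A test base class could call snapshot() before each suite and log the result of
leakedThreadNames(...) after it, so that the names can be pulled from the logs
after a CI run. Because some threads shut down asynchronously, a non-empty result
is a hint for manual inspection rather than proof of a leak.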



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


