spark-dev mailing list archives

From Nitin Goyal
Subject ClosureCleaner slowing down Spark SQL queries
Date Wed, 27 May 2015 17:38:20 GMT
Hi All,

I am running a SQL query (Spark version 1.2) on a table created from a
unionAll of 3 SchemaRDDs. The query executes in roughly 400ms (200ms at
the driver and roughly 200ms at the executors).

If I run the same query on a table created from a unionAll of 27
SchemaRDDs, I see that the executor time is the same (because of
concurrency and the nature of my query), but the driver time shoots up to
600ms, making the total query time 600 + 200 = 800ms.
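
As a rough back-of-the-envelope sketch of the two measurements above, the
snippet below draws a straight line through them, assuming that driver time
grows linearly with the number of unioned RDDs and that executor time stays
fixed at 200ms. With only two data points this is an illustration, not a
fit; the function names and the extrapolation are hypothetical and not part
of Spark.

```python
# Two (number of unioned RDDs, driver time in ms) points reported above.
MEASUREMENTS = [(3, 200.0), (27, 600.0)]

# Assumption: executor time is constant regardless of the number of RDDs.
EXECUTOR_MS = 200.0


def fit_line(p1, p2):
    """Return (slope, intercept) of the straight line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


slope, intercept = fit_line(*MEASUREMENTS)


def estimated_driver_ms(n_rdds):
    """Driver time under the assumed linear model."""
    return intercept + slope * n_rdds


def estimated_total_ms(n_rdds):
    """Driver time plus the assumed constant executor time."""
    return estimated_driver_ms(n_rdds) + EXECUTOR_MS


if __name__ == "__main__":
    print(f"per-RDD driver overhead: {slope:.1f} ms")
    print(f"fixed driver overhead:   {intercept:.1f} ms")
    for n in (3, 27, 54):
        print(f"{n:3d} RDDs -> estimated total {estimated_total_ms(n):.0f} ms")
```

Under this assumption, each additional unioned RDD adds roughly 17ms of
driver time, which is why the total keeps growing even though the executors
have spare capacity.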

I attached JProfiler and found that the ClosureCleaner clean method is
taking time at the driver (some issue related to URLClassLoader), and that
this time increases linearly with the number of RDDs unioned into the table
the query runs against. As a result the query takes far longer than I
expect: it should execute within 400ms irrespective of the number of RDDs,
since I have enough executors available to meet my needs. Please find below
the links to the screenshots from JProfiler:

Any help or suggestion to fix this would be highly appreciated, since this
needs to be fixed for production.

Thanks in Advance,
