spark-dev mailing list archives

From Jean-Baptiste Onofré
Subject Re: A proposal for Spark 2.0
Date Wed, 11 Nov 2015 04:45:53 GMT

I fully agree with that. Actually, I'm working on a PR to add "client"
and "exploded" profiles to the Maven build.

The client profile creates a spark-client-assembly jar, much more
lightweight than the spark-assembly. In our case, we construct jobs that
don't require all of the Spark server side. As it stands, the minimal
size of the generated jar is about 120MB, which is painful at
spark-submit submission time. That's why I started to remove unnecessary
dependencies from spark-assembly.

On the other hand, I'm also working on the "exploded" mode: instead of
using a fat monolithic spark-assembly jar file, the exploded mode lets
users view and change the dependencies.

For the client profile, I already have something ready; I will propose
the PR very soon (by the end of this week, hopefully). For the exploded
profile, I need more time.

My $0.02


On 11/11/2015 12:53 AM, Reynold Xin wrote:
> On Tue, Nov 10, 2015 at 3:35 PM, Nicholas Chammas wrote:
>     > 3. Assembly-free distribution of Spark: don’t require building an
>     > enormous assembly jar in order to run Spark.
>     Could you elaborate a bit on this? I'm not sure what an
>     assembly-free distribution means.
> Right now we ship Spark using a single assembly jar, which causes a few
> different problems:
> - the total number of classes is limited on some configurations
> - dependency swapping is harder
> The proposal is to just avoid a single fat jar.

Jean-Baptiste Onofré
Talend -
