spark-dev mailing list archives

From Konstantin Boudnik <...@apache.org>
Subject Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark
Date Thu, 06 Mar 2014 17:51:29 GMT
On Tue, Feb 25, 2014 at 03:20PM, Evan Chan wrote:
> The correct way to exclude dependencies in SBT is actually to declare
> a dependency as "provided".   I'm not familiar with Maven or its

Yes, I believe this would be equivalent to the Maven exclusion of an
artifact's transitive deps.

Cos

> dependencySet, but provided will mark the entire dependency tree as
> excluded.   It is also possible to exclude jar by jar, but this is
> pretty error prone and messy.
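
For readers following along, the two approaches Evan mentions look roughly
like this in an sbt build definition (the coordinates below are illustrative,
not taken from Spark's actual build):

```scala
// build.sbt -- sketch; artifact coordinates and versions are illustrative

// "provided": on the compile classpath, but left out of the assembly jar,
// so the deployment environment (e.g. a Bigtop Hadoop install) supplies it
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.2.0" % "provided"

// jar-by-jar exclusion of a transitive dependency -- works, but as noted
// above it is error prone, since every offending artifact must be listed
libraryDependencies += "org.apache.spark" %% "spark-core" % "0.9.0-incubating" exclude("org.slf4j", "slf4j-log4j12")
```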
> 
> On Tue, Feb 25, 2014 at 2:45 PM, Koert Kuipers <koert@tresata.com> wrote:
> > yes in sbt assembly you can exclude jars (although i never had a need for
> > this) and files in jars.
> >
> > for example i frequently remove log4j.properties, because for whatever
> > reason hadoop decided to include it making it very difficult to use our own
> > logging config.
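
The log4j.properties removal Koert describes can be expressed with
sbt-assembly's merge strategy; a sketch using the plugin's current key names
(older 0.x releases of the plugin used `mergeStrategy in assembly` with
slightly different wiring):

```scala
// build.sbt -- sketch; assumes the sbt-assembly plugin is on the build classpath
assemblyMergeStrategy in assembly := {
  // drop the log4j.properties that Hadoop bundles into its jars,
  // so our own logging config ends up in the assembly instead
  case "log4j.properties" => MergeStrategy.discard
  case x =>
    // fall back to the plugin's default strategy for everything else
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
```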
> >
> >
> >
> > On Tue, Feb 25, 2014 at 4:24 PM, Konstantin Boudnik <cos@apache.org> wrote:
> >
> >> On Fri, Feb 21, 2014 at 11:11AM, Patrick Wendell wrote:
> >> > Kos - thanks for chiming in. Could you be more specific about what is
> >> > available in maven and not in sbt for these issues? I took a look at
> >> > the bigtop code relating to Spark. As far as I could tell [1] was the
> >> > main point of integration with the build system (maybe there are other
> >> > integration points)?
> >> >
> >> > >   - in order to integrate Spark well into the existing Hadoop stack
> >> > >     it was necessary to have a way to avoid transitive dependency
> >> > >     duplications and possible conflicts.
> >> > >
> >> > >     E.g. Maven assembly allows us to avoid adding _all_ Hadoop libs
> >> > >     and later merely declare Spark package dependency on standard
> >> > >     Bigtop Hadoop packages. And yes - Bigtop packaging means the
> >> > >     naming and layout would be standard across all commercial Hadoop
> >> > >     distributions that are worth mentioning: ASF Bigtop convenience
> >> > >     binary packages, and Cloudera or Hortonworks packages. Hence, the
> >> > >     downstream user doesn't need to spend any effort to make sure
> >> > >     that Spark "clicks-in" properly.
> >> >
> >> > The sbt build also allows you to plug in a Hadoop version similar to
> >> > the maven build.
> >>
> >> I am actually talking about the ability to exclude a set of dependencies
> >> from an assembly, similar to what's happening in the dependencySet
> >> sections of
> >>     assembly/src/main/assembly/assembly.xml
> >> If there is comparable functionality in sbt, that would help quite a bit,
> >> apparently.
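
If it helps the comparison: sbt-assembly does expose something in this
direction via its `excludedJars` key. A sketch (the `hadoop-` filter pattern
is illustrative):

```scala
// build.sbt -- sketch; keeps hadoop-* jars out of the assembly so the
// resulting package can rely on the Bigtop-provided Hadoop libs instead,
// roughly comparable to a Maven assembly dependencySet with <excludes>
excludedJars in assembly := {
  val cp = (fullClasspath in assembly).value
  cp.filter(_.data.getName.startsWith("hadoop-"))
}
```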
> >>
> >> Cos
> >>
> >> > >   - Maven provides a relatively easy way to deal with the jar-hell
> >> > >     problem, although the original Maven build was just shading
> >> > >     everything into a huge lump of class files, oftentimes ending up
> >> > >     with classes slamming on top of each other from different
> >> > >     transitive dependencies.
> >> >
> >> > AFAIK we are only using the shade plug-in to deal with conflict
> >> > resolution in the assembly jar. These are dealt with in sbt via the
> >> > sbt-assembly plug-in in an identical way. Is there a difference?
> >>
> >> I am bringing up the shader because it is an awful hack that can't be
> >> used in a real, controlled deployment.
> >>
> >> Cos
> >>
> >> > [1]
> >> https://git-wip-us.apache.org/repos/asf?p=bigtop.git;a=blob;f=bigtop-packages/src/common/spark/do-component-build;h=428540e0f6aa56cd7e78eb1c831aa7fe9496a08f;hb=master
> >>
> 
> 
> 
> -- 
> Evan Chan
> Staff Engineer
> ev@ooyala.com  |
