spark-dev mailing list archives

From Tathagata Das <t...@databricks.com>
Subject Re: Speeding up Spark build during development
Date Tue, 05 May 2015 05:35:02 GMT
In addition to Michael's suggestion, in my SBT workflow I also use "~" to
automatically kick off the build and unit tests. For example,

sbt/sbt "~streaming/test-only *BasicOperationsSuite*"

It will automatically detect any file changes in the project and start
the compilation and testing.
So my full workflow involves changing code in IntelliJ and then
continuously running unit tests in the background on the command line using
this "~".
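
As a rough sketch of the same idea, a small shell function can assemble the
continuous-test command for any subproject and suite pattern. The helper name
`watch_test_cmd` and its arguments are hypothetical, not part of Spark's build
scripts. It only prints the command, so nothing runs until the printed line is
executed from the top of a Spark checkout:

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): print the sbt command that
# re-runs one test suite whenever a source file changes.
#   $1 = sbt subproject, e.g. streaming or sql
#   $2 = suite name fragment, wrapped in wildcards below
#   $3 = sbt launcher script (assumed default: sbt/sbt, as used above)
watch_test_cmd() {
  module="$1"
  suite="$2"
  launcher="${3:-sbt/sbt}"
  # The leading "~" asks sbt to watch the sources and re-run the task on change.
  printf '%s "~%s/test-only *%s*"\n' "$launcher" "$module" "$suite"
}

watch_test_cmd streaming BasicOperationsSuite
```

Passing `sql` and a suite name, with `build/sbt` as the third argument, gives
the watch-mode variant of a Spark SQL test run.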

TD


On Mon, May 4, 2015 at 2:49 PM, Michael Armbrust <michael@databricks.com>
wrote:

> FWIW... My Spark SQL development workflow is usually to run "build/sbt
> sparkShell" or "build/sbt 'sql/test-only <testSuiteName>'".  These commands
> start in as little as 30s on my laptop, automatically figure out which
> subprojects need to be rebuilt, and don't require the expensive assembly
> creation.
>
> On Mon, May 4, 2015 at 5:48 AM, Meethu Mathew <meethu.mathew@flytxt.com>
> wrote:
>
> > Hi,
> >
> > Is it really necessary to run "mvn --projects assembly/ -DskipTests
> > install"? Could you please explain why this is needed?
> > I got the changes after running "mvn --projects streaming/ -DskipTests
> > package".
> >
> > Regards,
> > Meethu
> >
> >
> > On Monday 04 May 2015 02:20 PM, Emre Sevinc wrote:
> >
> >> Just to give you an example:
> >>
> >> When I was trying to make a small change only to the Streaming component
> >> of Spark, first I built and installed the whole Spark project (this took
> >> about 15 minutes on my 4-core, 4 GB RAM laptop). Then, after having
> >> changed files only in Streaming, I ran something like (in the top-level
> >> directory):
> >>
> >>     mvn --projects streaming/ -DskipTests package
> >>
> >> and then
> >>
> >>     mvn --projects assembly/ -DskipTests install
> >>
> >>
> >> This was much faster than trying to build the whole Spark from scratch,
> >> because Maven was only building one component, in my case the Streaming
> >> component, of Spark. I think you can use a very similar approach.
> >>
> >> --
> >> Emre Sevinç
> >>
> >>
> >>
> >> On Mon, May 4, 2015 at 10:44 AM, Pramod Biligiri <pramodbiligiri@gmail.com>
> >> wrote:
> >>
> >>> No, I just need to build one project at a time. Right now Spark SQL.
> >>>
> >>> Pramod
> >>>
> >>> On Mon, May 4, 2015 at 12:09 AM, Emre Sevinc <emre.sevinc@gmail.com>
> >>> wrote:
> >>>
> >>>> Hello Pramod,
> >>>>
> >>>> Do you need to build the whole project every time? Generally you don't.
> >>>> For example, when I was changing some files that belong only to Spark
> >>>> Streaming, I was building only the streaming module (of course after
> >>>> having built and installed the whole project, but that was done only
> >>>> once), and then the assembly. This was much faster than trying to build
> >>>> the whole of Spark every time.
> >>>> --
> >>>> Emre Sevinç
> >>>>
> >>>> On Mon, May 4, 2015 at 9:01 AM, Pramod Biligiri <pramodbiligiri@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Using the built-in Maven and Zinc, it takes around 10 minutes for each
> >>>>> build. Is that reasonable?
> >>>>> My Maven opts look like this:
> >>>>> $ echo $MAVEN_OPTS
> >>>>> -Xmx12000m -XX:MaxPermSize=2048m
> >>>>>
> >>>>> I'm running it as build/mvn -DskipTests package
> >>>>>
> >>>>> Should I be tweaking my Zinc/Nailgun config?
> >>>>>
> >>>>> Pramod
> >>>>>
> >>>>> On Sun, May 3, 2015 at 3:40 PM, Mark Hamstra <mark@clearstorydata.com>
> >>>>> wrote:
> >>>>>
> >>>>>> https://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn
> >>>>>>
> >>>>>> On Sun, May 3, 2015 at 2:54 PM, Pramod Biligiri <pramodbiligiri@gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> This is great. I didn't know about the mvn script in the build
> >>>>>>> directory.
> >>>>>>>
> >>>>>>> Pramod
> >>>>>>>
> >>>>>>> On Fri, May 1, 2015 at 9:51 AM, York, Brennon <Brennon.York@capitalone.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Following what Ted said, if you leverage the `mvn` from within the
> >>>>>>>> `build/` directory of Spark you'll get zinc for free, which should
> >>>>>>>> help speed up build times.
> >>>>>>>>
> >>>>>>>> On 5/1/15, 9:45 AM, "Ted Yu" <yuzhihong@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Pramod:
> >>>>>>>>> Please remember to run Zinc so that the build is faster.
> >>>>>>>>>
> >>>>>>>>> Cheers
> >>>>>>>>>
> >>>>>>>>> On Fri, May 1, 2015 at 9:36 AM, Ulanov, Alexander
> >>>>>>>>> <alexander.ulanov@hp.com>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi Pramod,
> >>>>>>>>>>
> >>>>>>>>>> For cluster-like tests you might want to use the same code as in
> >>>>>>>>>> mllib's LocalClusterSparkContext. You can rebuild only the package
> >>>>>>>>>> that you change and then run this main class.
> >>>>>>>>>>
> >>>>>>>>>> Best regards, Alexander
> >>>>>>>>>>
> >>>>>>>>>> -----Original Message-----
> >>>>>>>>>> From: Pramod Biligiri [mailto:pramodbiligiri@gmail.com]
> >>>>>>>>>> Sent: Friday, May 01, 2015 1:46 AM
> >>>>>>>>>> To: dev@spark.apache.org
> >>>>>>>>>> Subject: Speeding up Spark build during development
> >>>>>>>>>>
> >>>>>>>>>> Hi,
> >>>>>>>>>> I'm making some small changes to the Spark codebase and trying it
> >>>>>>>>>> out on a cluster. I was wondering if there's a faster way to build
> >>>>>>>>>> than running the package target each time.
> >>>>>>>>>> Currently I'm using: mvn -DskipTests package
> >>>>>>>>>>
> >>>>>>>>>> All the nodes have the same filesystem mounted at the same mount
> >>>>>>>>>> point.
> >>>>>>>>>>
> >>>>>>>>>> Pramod
> >>>>>>>>>>
