spark-dev mailing list archives

From Chen He <airb...@gmail.com>
Subject Re: Intro to using IntelliJ to debug SPARK-1.1 Apps with mvn/sbt (for beginners)
Date Wed, 19 Nov 2014 04:26:56 GMT
Thank you, Yiming. It is helpful.

Regards!

Chen

On Tue, Nov 18, 2014 at 8:00 PM, Yiming (John) Zhang <sdiris@gmail.com>
wrote:

> Hi,
>
>
>
> I noticed it is hard to find a thorough introduction to using IntelliJ to
> debug Spark 1.1 apps with mvn/sbt, and the process is not straightforward for
> beginners. So I spent several days figuring it out, and I hope it will be
> helpful for beginners like me and that professionals can help me improve it.
> (The intro with figures can be found at:
> http://kylinx.com/spark/Debug-Spark-in-IntelliJ.htm)
>
>
>
> (1) Install the Scala plugin
>
>
>
> (2) Download, unzip and open spark-1.1.0 in IntelliJ
>
> a) mvn: File -> Open.
>
>     Select the Spark source folder (e.g., /root/spark-1.1.0). It may take a
> long time to download dependencies and compile everything.
>
> b) sbt: File -> Import Project.
>
>     Select "Import project from external model", then choose "SBT project" and
> click Next. Enter the Spark source path (e.g., /root/spark-1.1.0) for "SBT
> project", and select "Use auto-import".
>
>
>
> (3) First compile and run the Spark examples from the console to make sure
> everything is OK:
>
> # mvn -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests clean package
>
> # ./sbt/sbt assembly -Phadoop-2.2 -Dhadoop.version=2.2.0
>
>
>
> (4) Add the compiled spark-hadoop library
> (spark-assembly-1.1.0-hadoop2.2.0)
> to "Libraries" (File -> Project Structure. -> Libraries -> green +). And
> choose modules that use it (right-click the library and click "Add to
> Modules"). It seems only spark-examples need it.
>
>
>
> (5) In the "Dependencies" page of the modules using this library, ensure
> that the "Scope" of this library is "Compile" (File -> Project Structure ->
> Modules).
>
> (6) For sbt, it seems we have to mark the scope of all other Hadoop
> dependencies (SBT: org.apache.hadoop.hadoop-*) as "Test" (perhaps due to a
> poor Internet connection?), and this has to be redone every time IntelliJ is
> opened (due to a bug?).
>
>
>
> (7) Configure the debug environment (using LogQuery as an example): Run ->
> Edit Configurations.
>
> Main class: org.apache.spark.examples.LogQuery
>
> VM options: -Dspark.master=local
>
> Working directory: /root/spark-1.1.0
>
> Use classpath of module: spark-examples_2.10
>
> Before launch (if building with mvn): External tool: mvn
>
>     Program: /root/Programs/apache-maven-3.2.1/bin/mvn
>
>     Parameters: -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests package
>
>     Working directory: /root/spark-1.1.0
>
> Before launch (if building with sbt): External tool: sbt
>
>     Program: /root/spark-1.1.0/sbt/sbt
>
>     Parameters: -Phadoop-2.2 -Dhadoop.version=2.2.0 assembly
>
>     Working directory: /root/spark-1.1.0
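>
> A note on the VM option above: it works because SparkConf, when constructed
> with its defaults, copies every JVM system property whose name starts with
> "spark." into the configuration, so -Dspark.master=local has the same effect
> as calling setMaster("local") in code. A minimal sketch (ConfDemo is just an
> illustration, not a Spark class):
>
>     import org.apache.spark.SparkConf
>
>     object ConfDemo {
>       def main(args: Array[String]): Unit = {
>         // With -Dspark.master=local on the JVM command line, the default
>         // SparkConf constructor picks the value up automatically.
>         val conf = new SparkConf().setAppName("ConfDemo")
>         println(conf.get("spark.master"))  // should print "local"
>       }
>     }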
>
>
>
> (8) Click Run -> Debug 'LogQuery' to start debugging
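>
> The same kind of run configuration can be used to debug your own driver
> program rather than one of the bundled examples; a minimal sketch (DebugDemo
> and its job are only an illustration):
>
>     import org.apache.spark.{SparkConf, SparkContext}
>
>     object DebugDemo {
>       def main(args: Array[String]): Unit = {
>         // setMaster("local") can be dropped if -Dspark.master=local is
>         // supplied in the run configuration's VM options instead.
>         val conf = new SparkConf().setAppName("DebugDemo").setMaster("local")
>         val sc = new SparkContext(conf)
>
>         // Set a breakpoint on the next line and step through the job.
>         val counts = sc.parallelize(1 to 100).map(_ % 10).countByValue()
>         println(counts)
>
>         sc.stop()
>       }
>     }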
>
>
>
>
>
> Cheers,
>
> Yiming
>
>
