spark-dev mailing list archives

From Saikat Kanjilal <sxk1...@hotmail.com>
Subject Re: Spark-9487, Need some insight
Date Tue, 06 Dec 2016 18:17:33 GMT
Well, other than making the code consistent, what's the high-level goal in doing this, and why
does it matter so much how many workers we have in different scenarios (PySpark versus other
components of Spark)? To be honest, I'm OK with not making the change and working on something
else, but spending hours troubleshooting issues in a local dev environment that doesn't resemble
Jenkins closely enough is not a productive use of time. I would love to get input on the next
logical steps.


________________________________
From: Reynold Xin <rxin@databricks.com>
Sent: Monday, December 5, 2016 6:44 PM
To: Saikat Kanjilal
Cc: dev@spark.apache.org
Subject: Re: Spark-9487, Need some insight

Honestly, it is pretty difficult. Given the difficulty, would it still make sense to make that
change (the one that sets the same number of workers/parallelism across different languages
in testing)?


On Mon, Dec 5, 2016 at 3:33 PM, Saikat Kanjilal <sxk1969@hotmail.com> wrote:

Hello again dev community,

Pinging on this; apologies for reviving this thread, but I never heard from anyone. Based on this
link: https://wiki.jenkins-ci.org/display/JENKINS/Installing+Jenkins I can try to install
Jenkins locally, but is that really needed?


Thanks in advance.


________________________________
From: Saikat Kanjilal <sxk1969@hotmail.com>
Sent: Tuesday, November 29, 2016 8:14 PM
To: dev@spark.apache.org
Subject: Spark-9487, Need some insight


Hello Spark dev community,

I picked up the following JIRA item (https://github.com/apache/spark/pull/15848) and am looking
for some general pointers. I am running into issues where things work when developing locally
on my MacBook Pro but fail on Jenkins with a multitude of errors. For example, if you look at
this build output report: https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69297/
you will see DataFrameStatSuite. Locally, I am running these individual tests with
this command: ./build/mvn test -P... -DwildcardSuites=none -Dtest=org.apache.spark.sql.DataFrameStatSuite.

It seems that I need to emulate a Jenkins-like environment locally, which feels like an
untenable hurdle. Granted, my changes involve changing the total number of workers in the
SparkContext; given that, should I be testing my changes in an environment that more closely
resembles Jenkins? I really want to work on and complete this PR, but I keep getting hamstrung
by a dev environment that is not equivalent to our CI environment.



I'm guessing (and hoping) I'm not the first one to run into this, so any insights or pointers to
get past it would be much appreciated. I would love to keep contributing, and I'm hoping this
hurdle can be overcome with some tweaks to my dev environment.



Thanks in advance.

