spark-issues mailing list archives

From "Matthew McClain (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-20369) pyspark: Dynamic configuration with SparkConf does not work
Date Tue, 18 Apr 2017 15:25:41 GMT
Matthew McClain created SPARK-20369:
---------------------------------------

             Summary: pyspark: Dynamic configuration with SparkConf does not work
                 Key: SPARK-20369
                 URL: https://issues.apache.org/jira/browse/SPARK-20369
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 2.1.0
         Environment: Ubuntu 14.04.1 LTS (GNU/Linux 3.13.0-40-generic x86_64) and Mac OS X
10.11.6
            Reporter: Matthew McClain
            Priority: Minor


Setting Spark properties dynamically in PySpark using a SparkConf object does not work. Here
is code that reproduces the bug:
---
from pyspark import SparkContext, SparkConf

def main():

    conf = SparkConf().setAppName("spark-conf-test") \
        .setMaster("local[2]") \
        .set('spark.python.worker.memory',"1g") \
        .set('spark.executor.memory',"3g") \
        .set("spark.driver.maxResultSize","2g")

    print "Spark Config values in SparkConf:"
    print conf.toDebugString()

    sc = SparkContext(conf=conf)

    print "Actual Spark Config values:"
    print sc.getConf().toDebugString()

if __name__ == "__main__":
    main()
---

Here is the output; none of the config values set in SparkConf are used in the SparkContext
configuration:

Spark Config values in SparkConf:
spark.master=local[2]
spark.executor.memory=3g
spark.python.worker.memory=1g
spark.app.name=spark-conf-test
spark.driver.maxResultSize=2g
17/04/18 10:21:24 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform...
using builtin-java classes where applicable
Actual Spark Config values:
spark.app.id=local-1492528885708
spark.app.name=sandbox.py
spark.driver.host=10.201.26.172
spark.driver.maxResultSize=4g
spark.driver.port=54657
spark.executor.id=driver
spark.files=file:/Users/matt.mcclain/dev/datascience-experiments/mmcclain/client_clusters/sandbox.py
spark.master=local[*]
spark.rdd.compress=True
spark.serializer.objectStreamReset=100
spark.submit.deployMode=client
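
The actual values above (spark.master=local[*], spark.app.name=sandbox.py) suggest the configuration was fixed before the SparkConf object was consulted. As a possible workaround (an assumption, not part of the original report; sandbox.py is the script name taken from the spark.files entry above), the same properties can be passed on the spark-submit command line, which applies them before the Python driver starts:

```shell
# Hypothetical workaround: supply the properties to spark-submit directly
# instead of relying on SparkConf inside the script. These --conf flags are
# standard spark-submit options and take effect before the driver runs.
spark-submit \
  --master "local[2]" \
  --conf spark.python.worker.memory=1g \
  --conf spark.executor.memory=3g \
  --conf spark.driver.maxResultSize=2g \
  sandbox.py
```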

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

