spark-issues mailing list archives

From "Harish (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-20362) spark submit not considering user defined Configs (Pyspark)
Date Mon, 17 Apr 2017 22:26:41 GMT
Harish created SPARK-20362:
------------------------------

             Summary: spark submit not considering user defined Configs (Pyspark)
                 Key: SPARK-20362
                 URL: https://issues.apache.org/jira/browse/SPARK-20362
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 2.1.0
            Reporter: Harish


I am trying to set custom configuration at runtime (PySpark), but in the Spark UI at <ip>:8080
I see the job using the complete node/cluster resources, and the application name is "test.py"
(the script name). It looks like the user-defined configurations are not considered at job
submit.

Command: spark-submit test.py
Standalone mode (2 worker nodes and 1 master)
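Not part of the original report, but a common workaround for this symptom: some properties, notably spark.app.name and spark.driver.memory, are read before the user code creates its SparkSession, so setting them inside the script may come too late. A sketch of passing the same settings as spark-submit flags instead (the master URL and app name below are placeholders):

```shell
# Mirror the settings from test.py on the command line so they are
# applied before the driver JVM and SparkContext start.
spark-submit \
  --master spark://master:7077 \
  --name "my-test-app" \
  --driver-memory 8g \
  --executor-memory 8g \
  --executor-cores 3 \
  --conf spark.cores.max=10 \
  test.py
```

Flags take effect at launch time, so the UI should then show the given app name and resource limits rather than the script-name default.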

Here is the code:

test.py
from pyspark import SparkConf
from pyspark.sql import SparkSession, SQLContext, HiveContext

if __name__ == "__main__":
    conf = SparkConf().setAll([('spark.executor.memory', '8g'),
                               ('spark.executor.cores', '3'),
                               ('spark.cores.max', '10'),
                               ('spark.driver.memory', '8g')])
    spark = SparkSession.builder.config(conf=conf).enableHiveSupport().getOrCreate()
    sc = spark.sparkContext
    print(sc.getConf().getAll())
    sqlContext = SQLContext(sc)
    hiveContext = HiveContext(sc)
    print(hiveContext)
    print(sc.getConf().getAll())
    print("Complete")


Printed output:

[('spark.jars.packages', 'com.databricks:spark-csv_2.11:1.2.0'), ('spark.local.dir', '/mnt/sparklocaldir/'),
('hive.metastore.warehouse.dir', '<path>'), ('spark.app.id', 'app-20170417221942-0003'),
('spark.jars', 'file:/home/user/.ivy2/jars/com.databricks_spark-csv_2.11-1.2.0.jar,file:/home/user/.ivy2/jars/org.apache.commons_commons-csv-1.1.jar,file:/home/user/.ivy2/jars/com.univocity_univocity-parsers-1.5.1.jar'),
('spark.executor.id', 'driver'), ('spark.app.name', 'test.py'), ('spark.cores.max', '10'),
('spark.serializer', 'org.apache.spark.serializer.KryoSerializer'), ('spark.driver.port',
'35596'), ('spark.sql.catalogImplementation', 'hive'), ('spark.sql.warehouse.dir', '<path>'),
('spark.rdd.compress', 'True'), ('spark.driver.memory', '8g'), ('spark.serializer.objectStreamReset',
'100'), ('spark.executor.memory', '8g'), ('spark.executor.cores', '3'), ('spark.submit.deployMode',
'client'), ('spark.files', 'file:/home/user/test.py,file:/home/user/.ivy2/jars/com.databricks_spark-csv_2.11-1.2.0.jar,file:/home/user/.ivy2/jars/org.apache.commons_commons-csv-1.1.jar,file:/home/user/.ivy2/jars/com.univocity_univocity-parsers-1.5.1.jar'),
('spark.master', 'spark://master:7077'), ('spark.submit.pyFiles', '/home/user/.ivy2/jars/com.databricks_spark-csv_2.11-1.2.0.jar,/home/user/.ivy2/jars/org.apache.commons_commons-csv-1.1.jar,/home/user/.ivy2/jars/com.univocity_univocity-parsers-1.5.1.jar'),
('spark.driver.host', 'master')]

<pyspark.sql.context.HiveContext object at 0x7f6f87b2e5f8>

[('spark.jars.packages', 'com.databricks:spark-csv_2.11:1.2.0'), ('spark.local.dir', '/mnt/sparklocaldir/'),
('hive.metastore.warehouse.dir', '<path>'), ('spark.app.id', 'app-20170417221942-0003'),
('spark.jars', 'file:/home/user/.ivy2/jars/com.databricks_spark-csv_2.11-1.2.0.jar,file:/home/user/.ivy2/jars/org.apache.commons_commons-csv-1.1.jar,file:/home/user/.ivy2/jars/com.univocity_univocity-parsers-1.5.1.jar'),
('spark.executor.id', 'driver'), ('spark.app.name', 'test.py'), ('spark.cores.max', '10'),
('spark.serializer', 'org.apache.spark.serializer.KryoSerializer'), ('spark.driver.port',
'35596'), ('spark.sql.catalogImplementation', 'hive'), ('spark.sql.warehouse.dir', '<path>'),
('spark.rdd.compress', 'True'), ('spark.driver.memory', '8g'), ('spark.serializer.objectStreamReset',
'100'), ('spark.executor.memory', '8g'), ('spark.executor.cores', '3'), ('spark.submit.deployMode',
'client'), ('spark.files', 'file:/home/user/test.py,file:/home/user/.ivy2/jars/com.databricks_spark-csv_2.11-1.2.0.jar,file:/home/user/.ivy2/jars/org.apache.commons_commons-csv-1.1.jar,file:/home/user/.ivy2/jars/com.univocity_univocity-parsers-1.5.1.jar'),
('spark.master', 'spark://master:7077'), ('spark.submit.pyFiles', '/home/user/.ivy2/jars/com.databricks_spark-csv_2.11-1.2.0.jar,/home/user/.ivy2/jars/org.apache.commons_commons-csv-1.1.jar,/home/user/.ivy2/jars/com.univocity_univocity-parsers-1.5.1.jar'),
('spark.driver.host', 'master')]

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
