spark-issues mailing list archives

From "xinzhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-22007) spark-submit on yarn or local , got different result
Date Thu, 14 Sep 2017 09:09:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-22007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16165972#comment-16165972
] 

xinzhang commented on SPARK-22007:
----------------------------------

Yes, I figured it out. The fix is to add this when building the SparkSession:
.config("hive.metastore.uris", "thrift://11.11.11.11:9083") \

Perhaps the docs here should describe this in more detail:
https://spark.apache.org/docs/latest/api/python/pyspark.sql.html#pyspark.sql.SparkSession.Builder
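For reference, the fix above can be folded into the reporter's script roughly like this. This is a sketch, not the exact script from the issue: the thrift address is the one quoted in this comment, and the `metastore_conf` dict and `build_session` helper are illustrative names added here.

```python
#!/usr/bin/env python
# Sketch of the fix: point Spark at the remote Hive metastore explicitly,
# so a yarn-cluster driver does not fall back to an embedded Derby instance.
# The thrift address below is the one from the comment; substitute your own.

metastore_conf = {
    "hive.metastore.uris": "thrift://11.11.11.11:9083",
    "spark.sql.warehouse.dir": "/group/user/yangtt/meta/hive-temp-table",
}

def build_session(conf):
    # Imported lazily so the config dict above can be inspected
    # without a Spark installation on the local machine.
    from pyspark.sql import SparkSession
    builder = SparkSession.builder.appName("Python_Spark_SQL_Hive")
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.enableHiveSupport().getOrCreate()

if __name__ == "__main__":
    spark = build_session(metastore_conf)
    spark.sql("show databases").show()
```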


> spark-submit on yarn or local , got different result
> ----------------------------------------------------
>
>                 Key: SPARK-22007
>                 URL: https://issues.apache.org/jira/browse/SPARK-22007
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, Spark Shell, Spark Submit
>    Affects Versions: 2.1.0
>            Reporter: xinzhang
>
> Submit the py script in local mode:
> /opt/spark/spark-bin/bin/spark-submit --master local test_hive.py
> result:
> +------------+
> |databaseName|
> +------------+
> |     default|
> |         zzzz|
> |       xxxxx|
> +------------+
> Submit the py script on YARN:
> /opt/spark/spark-bin/bin/spark-submit --master yarn --deploy-mode cluster test_hive.py
> result:
> +------------+
> |databaseName|
> +------------+
> |     default|
> +------------+
> the py script :
> [yangtt@dc-gateway119 test]$ cat test_hive.py 
> #!/usr/bin/env python
> #coding=utf-8
> from os.path import expanduser, join, abspath
> from pyspark.sql import SparkSession
> from pyspark.sql import Row
> from pyspark.conf import SparkConf
> def squared(s):
>   return s * s
> warehouse_location = abspath('/group/user/yangtt/meta/hive-temp-table')
> spark = SparkSession \
>     .builder \
>     .appName("Python_Spark_SQL_Hive") \
>     .config("spark.sql.warehouse.dir", warehouse_location) \
>     .config(conf=SparkConf()) \
>     .enableHiveSupport() \
>     .getOrCreate()
> spark.udf.register("squared",squared)
> spark.sql("show databases").show()
> Q: why does Spark load a different Hive metastore in each mode?
> Does YARN always use Derby?
> 17/09/14 16:10:55 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
> My actual metastore is in MySQL.
> Any suggestions would be helpful.
> Thanks.
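A plausible reading of the symptom above: in local mode the driver runs on the gateway host and picks up hive-site.xml from the local conf directory, while in yarn-cluster mode the driver runs on an arbitrary cluster node that may not have that file, so Hive falls back to an embedded Derby metastore. Besides setting `hive.metastore.uris` in code (as in the comment), making hive-site.xml visible to the cluster-side driver should also work. A sketch, using the thrift address from the comment:

```
<!-- hive-site.xml: place in $SPARK_HOME/conf on every node (or ship it
     with the job) so the yarn-cluster driver sees the real metastore. -->
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://11.11.11.11:9083</value>
  </property>
</configuration>
```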



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

