From: "Apache Spark (JIRA)"
To: issues@spark.apache.org
Date: Wed, 18 May 2016 02:21:13 +0000 (UTC)
Subject: [jira] [Assigned] (SPARK-15345) SparkSession's conf doesn't take effect when there is already an existing SparkContext
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/SPARK-15345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-15345:
------------------------------------

    Assignee: Apache Spark

> SparkSession's conf doesn't take effect when there is already an existing SparkContext
> ---------------------------------------------------------------------------------------
>
>                 Key: SPARK-15345
>                 URL: https://issues.apache.org/jira/browse/SPARK-15345
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 2.0.0
>            Reporter: Piotr Milanowski
>            Assignee: Apache Spark
>            Priority: Blocker
>
> I am working with branch-2.0; Spark is compiled with Hive support (-Phive and -Phive-thriftserver).
> I am trying to access databases using this snippet:
> {code}
> from pyspark.sql import HiveContext
> hc = HiveContext(sc)
> hc.sql("show databases").collect()
> [Row(result='default')]
> {code}
> This means that Spark doesn't find any of the databases specified in the configuration.
> Using the same configuration (i.e. hive-site.xml and core-site.xml) in Spark 1.6 and launching the above snippet, I can print out the existing databases.
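> Editor's note: a minimal sketch of the same symptom through the Spark 2.0 SparkSession API, not part of the original report. It assumes a PySpark shell where {{sc}} already exists; the conf key spark.sql.catalogImplementation is used only as an illustrative check of which catalog was picked up.
> {code}
> from pyspark.sql import SparkSession
>
> # sc already exists in the shell, so getOrCreate() reuses the running
> # SparkContext instead of creating a new one with the requested options.
> spark = SparkSession.builder.enableHiveSupport().getOrCreate()
>
> # If Hive support had taken effect this would report 'hive'; in the
> # failing case the session stays on the default in-memory catalog.
> print(spark.conf.get("spark.sql.catalogImplementation", "in-memory"))
>
> # Same result as the HiveContext snippet above: only the 'default' database.
> print(spark.sql("show databases").collect())
> {code}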
> When run in DEBUG mode this is what spark (2.0) prints out:
> {code}
> 16/05/16 12:17:47 INFO SparkSqlParser: Parsing command: show databases
> 16/05/16 12:17:47 DEBUG SimpleAnalyzer:
> === Result of Batch Resolution ===
> !'Project [unresolveddeserializer(createexternalrow(if (isnull(input[0, string])) null else input[0, string].toString, StructField(result,StringType,false)), result#2) AS #3]   Project [createexternalrow(if (isnull(result#2)) null else result#2.toString, StructField(result,StringType,false)) AS #3]
> +- LocalRelation [result#2]   +- LocalRelation [result#2]
>
> 16/05/16 12:17:47 DEBUG ClosureCleaner: +++ Cleaning closure (org.apache.spark.sql.Dataset$$anonfun$53) +++
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + declared fields: 2
> 16/05/16 12:17:47 DEBUG ClosureCleaner: public static final long org.apache.spark.sql.Dataset$$anonfun$53.serialVersionUID
> 16/05/16 12:17:47 DEBUG ClosureCleaner: private final org.apache.spark.sql.types.StructType org.apache.spark.sql.Dataset$$anonfun$53.structType$1
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + declared methods: 2
> 16/05/16 12:17:47 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.sql.Dataset$$anonfun$53.apply(java.lang.Object)
> 16/05/16 12:17:47 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.sql.Dataset$$anonfun$53.apply(org.apache.spark.sql.catalyst.InternalRow)
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + inner classes: 0
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + outer classes: 0
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + outer objects: 0
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + populating accessed fields because this is the starting closure
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + fields accessed by starting closure: 0
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + there are no enclosing objects!
> 16/05/16 12:17:47 DEBUG ClosureCleaner: +++ closure (org.apache.spark.sql.Dataset$$anonfun$53) is now cleaned +++
> 16/05/16 12:17:47 DEBUG ClosureCleaner: +++ Cleaning closure (org.apache.spark.sql.execution.python.EvaluatePython$$anonfun$javaToPython$1) +++
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + declared fields: 1
> 16/05/16 12:17:47 DEBUG ClosureCleaner: public static final long org.apache.spark.sql.execution.python.EvaluatePython$$anonfun$javaToPython$1.serialVersionUID
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + declared methods: 2
> 16/05/16 12:17:47 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.sql.execution.python.EvaluatePython$$anonfun$javaToPython$1.apply(java.lang.Object)
> 16/05/16 12:17:47 DEBUG ClosureCleaner: public final org.apache.spark.api.python.SerDeUtil$AutoBatchedPickler org.apache.spark.sql.execution.python.EvaluatePython$$anonfun$javaToPython$1.apply(scala.collection.Iterator)
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + inner classes: 0
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + outer classes: 0
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + outer objects: 0
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + populating accessed fields because this is the starting closure
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + fields accessed by starting closure: 0
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + there are no enclosing objects!
> 16/05/16 12:17:47 DEBUG ClosureCleaner: +++ closure (org.apache.spark.sql.execution.python.EvaluatePython$$anonfun$javaToPython$1) is now cleaned +++
> 16/05/16 12:17:47 DEBUG ClosureCleaner: +++ Cleaning closure (org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13) +++
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + declared fields: 2
> 16/05/16 12:17:47 DEBUG ClosureCleaner: public static final long org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.serialVersionUID
> 16/05/16 12:17:47 DEBUG ClosureCleaner: private final org.apache.spark.rdd.RDD$$anonfun$collect$1 org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.$outer
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + declared methods: 2
> 16/05/16 12:17:47 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(java.lang.Object)
> 16/05/16 12:17:47 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(scala.collection.Iterator)
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + inner classes: 0
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + outer classes: 2
> 16/05/16 12:17:47 DEBUG ClosureCleaner: org.apache.spark.rdd.RDD$$anonfun$collect$1
> 16/05/16 12:17:47 DEBUG ClosureCleaner: org.apache.spark.rdd.RDD
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + outer objects: 2
> 16/05/16 12:17:47 DEBUG ClosureCleaner:
> 16/05/16 12:17:47 DEBUG ClosureCleaner: MapPartitionsRDD[5] at collect at :1
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + populating accessed fields because this is the starting closure
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + fields accessed by starting closure: 2
> 16/05/16 12:17:47 DEBUG ClosureCleaner: (class org.apache.spark.rdd.RDD$$anonfun$collect$1,Set($outer))
> 16/05/16 12:17:47 DEBUG ClosureCleaner: (class org.apache.spark.rdd.RDD,Set(org$apache$spark$rdd$RDD$$evidence$1))
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + outermost object is not a closure or REPL line object, so do not clone it: (class org.apache.spark.rdd.RDD,MapPartitionsRDD[5] at collect at :1)
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + cloning the object of class org.apache.spark.rdd.RDD$$anonfun$collect$1
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + cleaning cloned closure recursively (org.apache.spark.rdd.RDD$$anonfun$collect$1)
> 16/05/16 12:17:47 DEBUG ClosureCleaner: +++ Cleaning closure (org.apache.spark.rdd.RDD$$anonfun$collect$1) +++
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + declared fields: 2
> 16/05/16 12:17:47 DEBUG ClosureCleaner: public static final long org.apache.spark.rdd.RDD$$anonfun$collect$1.serialVersionUID
> 16/05/16 12:17:47 DEBUG ClosureCleaner: private final org.apache.spark.rdd.RDD org.apache.spark.rdd.RDD$$anonfun$collect$1.$outer
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + declared methods: 2
> 16/05/16 12:17:47 DEBUG ClosureCleaner: public org.apache.spark.rdd.RDD org.apache.spark.rdd.RDD$$anonfun$collect$1.org$apache$spark$rdd$RDD$$anonfun$$$outer()
> 16/05/16 12:17:47 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.rdd.RDD$$anonfun$collect$1.apply()
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + inner classes: 1
> 16/05/16 12:17:47 DEBUG ClosureCleaner: org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + outer classes: 1
> 16/05/16 12:17:47 DEBUG ClosureCleaner: org.apache.spark.rdd.RDD
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + outer objects: 1
> 16/05/16 12:17:47 DEBUG ClosureCleaner: MapPartitionsRDD[5] at collect at :1
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + fields accessed by starting closure: 2
> 16/05/16 12:17:47 DEBUG ClosureCleaner: (class org.apache.spark.rdd.RDD$$anonfun$collect$1,Set($outer))
> 16/05/16 12:17:47 DEBUG ClosureCleaner: (class org.apache.spark.rdd.RDD,Set(org$apache$spark$rdd$RDD$$evidence$1))
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + outermost object is not a closure or REPL line object, so do not clone it: (class org.apache.spark.rdd.RDD,MapPartitionsRDD[5] at collect at :1)
> 16/05/16 12:17:47 DEBUG ClosureCleaner: +++ closure (org.apache.spark.rdd.RDD$$anonfun$collect$1) is now cleaned +++
> 16/05/16 12:17:47 DEBUG ClosureCleaner: +++ closure (org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13) is now cleaned +++
> 16/05/16 12:17:47 DEBUG ClosureCleaner: +++ Cleaning closure (org.apache.spark.SparkContext$$anonfun$runJob$5) +++
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + declared fields: 2
> 16/05/16 12:17:47 DEBUG ClosureCleaner: public static final long org.apache.spark.SparkContext$$anonfun$runJob$5.serialVersionUID
> 16/05/16 12:17:47 DEBUG ClosureCleaner: private final scala.Function1 org.apache.spark.SparkContext$$anonfun$runJob$5.cleanedFunc$1
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + declared methods: 2
> 16/05/16 12:17:47 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.SparkContext$$anonfun$runJob$5.apply(java.lang.Object,java.lang.Object)
> 16/05/16 12:17:47 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.SparkContext$$anonfun$runJob$5.apply(org.apache.spark.TaskContext,scala.collection.Iterator)
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + inner classes: 0
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + outer classes: 0
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + outer objects: 0
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + populating accessed fields because this is the starting closure
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + fields accessed by starting closure: 0
> 16/05/16 12:17:47 DEBUG ClosureCleaner: + there are no enclosing objects!
> 16/05/16 12:17:47 DEBUG ClosureCleaner: +++ closure (org.apache.spark.SparkContext$$anonfun$runJob$5) is now cleaned +++
> 16/05/16 12:17:47 INFO SparkContext: Starting job: collect at :1
> 16/05/16 12:17:47 INFO DAGScheduler: Got job 1 (collect at :1) with 1 output partitions
> 16/05/16 12:17:47 INFO DAGScheduler: Final stage: ResultStage 1 (collect at :1)
> 16/05/16 12:17:47 INFO DAGScheduler: Parents of final stage: List()
> 16/05/16 12:17:47 INFO DAGScheduler: Missing parents: List()
> 16/05/16 12:17:47 DEBUG DAGScheduler: submitStage(ResultStage 1)
> 16/05/16 12:17:47 DEBUG DAGScheduler: missing: List()
> 16/05/16 12:17:47 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[5] at collect at :1), which has no missing parents
> 16/05/16 12:17:47 DEBUG DAGScheduler: submitMissingTasks(ResultStage 1)
> 16/05/16 12:17:47 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.1 KB, free 5.8 GB)
> 16/05/16 12:17:47 DEBUG BlockManager: Put block broadcast_1 locally took 1 ms
> 16/05/16 12:17:47 DEBUG BlockManager: Putting block broadcast_1 without replication took 1 ms
> 16/05/16 12:17:47 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1856.0 B, free 5.8 GB)
> 16/05/16 12:17:47 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 188.165.13.157:35738 (size: 1856.0 B, free: 5.8 GB)
> 16/05/16 12:17:47 DEBUG BlockManagerMaster: Updated info of block broadcast_1_piece0
> 16/05/16 12:17:47 DEBUG BlockManager: Told master about block broadcast_1_piece0
> 16/05/16 12:17:47 DEBUG BlockManager: Put block broadcast_1_piece0 locally took 1 ms
> 16/05/16 12:17:47 DEBUG BlockManager: Putting block broadcast_1_piece0 without replication took 2 ms
> 16/05/16 12:17:47 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1012
> 16/05/16 12:17:47 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[5] at collect at :1)
> 16/05/16 12:17:47 DEBUG DAGScheduler: New pending partitions: Set(0)
> 16/05/16 12:17:47 INFO TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
> 16/05/16 12:17:47 DEBUG TaskSetManager: Epoch for TaskSet 1.0: 0
> 16/05/16 12:17:47 DEBUG TaskSetManager: Valid locality levels for TaskSet 1.0: NO_PREF, ANY
> 16/05/16 12:17:47 DEBUG TaskSchedulerImpl: parentName: , name: TaskSet_1, runningTasks: 0
> 16/05/16 12:17:47 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, xxx3, partition 0, PROCESS_LOCAL, 5542 bytes)
> 16/05/16 12:17:47 DEBUG TaskSetManager: No tasks for locality level NO_PREF, so moving to locality level ANY
> 16/05/16 12:17:47 INFO SparkDeploySchedulerBackend: Launching task 1 on executor id: 0 hostname: xxx3.
> 16/05/16 12:17:48 DEBUG TaskSchedulerImpl: parentName: , name: TaskSet_1, runningTasks: 1
> 16/05/16 12:17:48 DEBUG BlockManager: Getting local block broadcast_1_piece0 as bytes
> 16/05/16 12:17:48 DEBUG BlockManager: Level for block broadcast_1_piece0 is StorageLevel(disk=true, memory=true, offheap=false, deserialized=false, replication=1)
> 16/05/16 12:17:48 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 188.165.13.158:53616 (size: 1856.0 B, free: 14.8 GB)
> 16/05/16 12:17:49 DEBUG TaskSchedulerImpl: parentName: , name: TaskSet_1, runningTasks: 1
> 16/05/16 12:17:50 DEBUG TaskSchedulerImpl: parentName: , name: TaskSet_1, runningTasks: 1
> 16/05/16 12:17:50 DEBUG TaskSchedulerImpl: parentName: , name: TaskSet_1, runningTasks: 0
> 16/05/16 12:17:50 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 2156 ms on xxx3 (1/1)
> 16/05/16 12:17:50 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
> 16/05/16 12:17:50 INFO DAGScheduler: ResultStage 1 (collect at :1) finished in 2.158 s
> 16/05/16 12:17:50 DEBUG DAGScheduler: After removal of stage 1, remaining stages = 0
> 16/05/16 12:17:50 INFO DAGScheduler: Job 1 finished: collect at :1, took 2.174808 s
> {code}
> I can't see any information about a Hive connection in this trace.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org