spark-reviews mailing list archives

From liancheng <...@git.apache.org>
Subject [GitHub] spark pull request: [SPARK-4037][SQL] Removes the SessionState ins...
Date Tue, 28 Oct 2014 17:09:07 GMT
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2887#discussion_r19486333
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala ---
    @@ -288,8 +296,15 @@ class HiveContext(sc: SparkContext) extends SQLContext(sc) {
           val cmd_1: String = cmd_trimmed.substring(tokens(0).length()).trim()
           val proc: CommandProcessor = HiveShim.getCommandProcessor(Array(tokens(0)), hiveconf)
     
    +      // Makes sure the session represented by the `sessionState` field is activated. This implies
    +      // Spark SQL Hive support uses a single `SessionState` for all Hive operations and breaks
    +      // session isolation under multi-user scenarios (i.e. HiveThriftServer2).
    +      // TODO Fix session isolation
    +      SessionState.start(sessionState)
    --- End diff --
    
    Hm, `sessionState` does get started within the same thread multiple times. However, at
least for Hive 0.12.0, I think `SessionState.start(sessionState)` should be an idempotent
operation, so calling it multiple times shouldn't hurt. Maybe this no longer holds for Hive
0.13.1.
    
    Another issue that really puzzles me is this line from the [Jenkins test failure log](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/22361/testReport/org.apache.spark.sql.hive.execution/HiveCompatibilitySuite/groupby1/):
    
    ```
    	at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:342)
    ```
    
    The line number suggests that Hive 0.13.1 was used (see [here](https://github.com/apache/hive/blob/release-0.13.1/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java#L342)),
but this Jenkins build was triggered with the Hive 0.12.0 profile:
    
    ```
    -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl -Phive -Phive-0.12.0
    ```
    
    Anyway, I'll follow your suggestion to avoid starting `sessionState` multiple times and
try again. Thanks!
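
A minimal sketch of one way to avoid starting the same session repeatedly on a thread. Everything here is hypothetical: `StubSession` stands in for Hive's `SessionState`, and `SessionGuard.startOnce` is an illustrative helper, not code from this pull request. It assumes the session is tracked per thread, so the guard remembers the active instance in a `ThreadLocal` and skips the start when the same instance is passed again.

```scala
// Hypothetical stand-in for Hive's SessionState; not the real API.
final class StubSession(val name: String) {
  // Counts how many times the session was actually started.
  var startCount: Int = 0
}

// Hypothetical helper that starts a session at most once per thread.
object SessionGuard {
  // Remembers which session instance is active on the current thread (null if none).
  private val active = new ThreadLocal[StubSession]

  // Starts `session` on the current thread unless that same instance is
  // already active there. Returns true only if a start actually happened.
  def startOnce(session: StubSession): Boolean = {
    if (session eq active.get()) {
      false
    } else {
      session.startCount += 1 // stands in for SessionState.start(session)
      active.set(session)
      true
    }
  }
}

object Demo extends App {
  val shared = new StubSession("shared")
  SessionGuard.startOnce(shared) // starts the session
  SessionGuard.startOnce(shared) // no-op: already active on this thread
  println(shared.startCount)
}
```

With a guard like this, correctness no longer depends on whether a given Hive version treats a repeated start as idempotent. It does not address the session isolation problem noted in the TODO, since all callers still share one session.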

