spark-issues mailing list archives

From "Hu Liu, (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
Date Tue, 05 Sep 2017 06:29:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hu Liu, updated SPARK-21918:
----------------------------
    Description: 
While testing the Spark Thrift Server, I found that all DDL statements are run as user
hive even when hive.server2.enable.doAs=true.
The root cause is that the Hive object is shared between different threads in HiveClientImpl:
{code:java}
  private def client: Hive = {
    if (clientLoader.cachedHive != null) {
      clientLoader.cachedHive.asInstanceOf[Hive]
    } else {
      val c = Hive.get(conf)
      clientLoader.cachedHive = c
      c
    }
  }
{code}
But in impersonation mode, the Hive object should be shared only within a thread, so that
the metastore client in Hive is associated with the right user.
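
The difference between a process-wide cache and a per-thread cache can be sketched in a few lines of Scala. This is only a simplified model under stated assumptions: FakeHive, SharedCache, PerThreadCache and Demo are hypothetical names invented for illustration, and FakeHive merely remembers the user it was created for. It is not the Spark or Hive implementation.

```scala
// Hypothetical stand-in for the real Hive object: it only records the user
// it was created for.
final case class FakeHive(user: String)

// Models the pattern in the snippet above: one cached object for all threads.
object SharedCache {
  private var cached: FakeHive = null

  def client(user: String): FakeHive = synchronized {
    // The first caller's object is reused by every later caller.
    if (cached == null) cached = FakeHive(user)
    cached
  }
}

// Models a cache that is shared only inside a thread.
object PerThreadCache {
  private val cached = new ThreadLocal[FakeHive]

  def client(user: String): FakeHive = {
    // Each thread creates and keeps its own object.
    if (cached.get == null) cached.set(FakeHive(user))
    cached.get
  }
}

object Demo {
  // Evaluates f on a fresh thread and returns its result to the caller.
  def onNewThread[A](f: => A): A = {
    var result: Option[A] = None
    val t = new Thread(new Runnable {
      def run(): Unit = { result = Some(f) }
    })
    t.start()
    t.join()
    result.get
  }
}
```

With the shared cache, a thread acting for another user still receives the object created for the first caller; with the per-thread cache, each thread gets an object tied to its own user.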

We can fix this by passing the Hive object of the parent thread to the child thread when
running the SQL.
I already have an initial patch for review and would be glad to work on this if someone could assign
it to me.
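
The hand-off from parent thread to child thread described above might look roughly like the following sketch. It is not the patch itself: SessionHive, HiveHolder and HandOff are hypothetical names, and the thread-local holder only models the idea of an object that is current for a thread.

```scala
// Hypothetical stand-in for a Hive object bound to one session user.
final case class SessionHive(user: String)

// Hypothetical per-thread holder for the "current" object.
object HiveHolder {
  private val current = new ThreadLocal[SessionHive]

  def get: SessionHive = current.get
  def set(h: SessionHive): Unit = current.set(h)
}

object HandOff {
  // Reads the parent's object on the calling thread, installs it in the
  // child thread, then evaluates body there and returns its result.
  def runInChild[A](body: => A): A = {
    val parentHive = HiveHolder.get // captured on the parent thread
    var result: Option[A] = None
    val child = new Thread(new Runnable {
      def run(): Unit = {
        HiveHolder.set(parentHive) // the child now sees the parent's object
        result = Some(body)
      }
    })
    child.start()
    child.join()
    result.get
  }
}
```

For example, after HiveHolder.set(SessionHive("alice")) on the parent thread, HandOff.runInChild(HiveHolder.get) evaluates in the child thread and returns the object created for alice rather than an unset value.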


  was:
While testing the Spark Thrift Server, I found that all DDL statements are run as user
hive even when hive.server2.enable.doAs=true.
The root cause is that the Hive object is shared between different threads in HiveClientImpl:
{code:java}
  private def client: Hive = {
    if (clientLoader.cachedHive != null) {
      clientLoader.cachedHive.asInstanceOf[Hive]
    } else {
      val c = Hive.get(conf)
      clientLoader.cachedHive = c
      c
    }
  }
{code}
But in impersonation mode, the Hive object should be shared only within a thread.

We can fix this by passing the Hive object of the current thread to the new thread when running the SQL.
I already have an initial patch for review and would be glad to work on this if someone could assign
it to me.



> HiveClient shouldn't share Hive object between different thread
> ---------------------------------------------------------------
>
>                 Key: SPARK-21918
>                 URL: https://issues.apache.org/jira/browse/SPARK-21918
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Hu Liu,
>
> While testing the Spark Thrift Server, I found that all DDL statements are run as user
> hive even when hive.server2.enable.doAs=true.
> The root cause is that the Hive object is shared between different threads in HiveClientImpl:
> {code:java}
>   private def client: Hive = {
>     if (clientLoader.cachedHive != null) {
>       clientLoader.cachedHive.asInstanceOf[Hive]
>     } else {
>       val c = Hive.get(conf)
>       clientLoader.cachedHive = c
>       c
>     }
>   }
> {code}
> But in impersonation mode, the Hive object should be shared only within a thread, so that
> the metastore client in Hive is associated with the right user.
> We can fix this by passing the Hive object of the parent thread to the child thread when
> running the SQL.
> I already have an initial patch for review and would be glad to work on this if someone
> could assign it to me.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
