hadoop-hive-dev mailing list archives

From "Ning Zhang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-584) Clean up global and ThreadLocal variables in Hive
Date Fri, 11 Sep 2009 18:55:58 GMT

    [ https://issues.apache.org/jira/browse/HIVE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754280#action_12754280 ]

Ning Zhang commented on HIVE-584:
---------------------------------

I just had an offline conversation with Zheng to understand the concepts involved:

Lifetime:
  - The session is the longest-living object. Its lifetime runs from the moment a user connects
to Hive through the Hive CLI or Hive Server until the user closes that connection.
  - The DB connection's lifetime is the period during which Hive is connected to the metastore.
Currently it is always the same as the session's lifetime. Going forward it could differ,
because a session could connect to different metastores (by providing a different HiveConf).
This is useful when there are multiple Hadoop clusters, each of which maintains
a different Hive DB.
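
To make the two lifetimes concrete, here is a rough sketch of a session that outlives its
metastore connections. Every name in it (SketchConf, SketchMetastoreConnection, SketchSession)
is a hypothetical stand-in invented for illustration, not a class from Hive or from the attached
patches, and the real HiveConf carries far more than a metastore URI.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for HiveConf: here it only names a metastore.
final class SketchConf {
    final String metastoreUri;
    SketchConf(String metastoreUri) { this.metastoreUri = metastoreUri; }
}

// Hypothetical stand-in for a DB (metastore) connection.
final class SketchMetastoreConnection implements AutoCloseable {
    final String uri;
    private boolean open = true;
    SketchMetastoreConnection(String uri) { this.uri = uri; }
    boolean isOpen() { return open; }
    @Override public void close() { open = false; }
}

// The session is the longest-living object; connections come and go inside it.
final class SketchSession implements AutoCloseable {
    private final Map<String, SketchMetastoreConnection> connections = new HashMap<>();

    // Returns the connection for this conf's metastore, opening it on first use.
    SketchMetastoreConnection connect(SketchConf conf) {
        return connections.computeIfAbsent(conf.metastoreUri, SketchMetastoreConnection::new);
    }

    // Ends one connection's lifetime while the session stays alive.
    void disconnect(SketchConf conf) {
        SketchMetastoreConnection c = connections.remove(conf.metastoreUri);
        if (c != null) {
            c.close();
        }
    }

    int openConnections() { return connections.size(); }

    // Ending the session ends every connection it still holds.
    @Override public void close() {
        for (SketchMetastoreConnection c : connections.values()) {
            c.close();
        }
        connections.clear();
    }
}
```

In this sketch a connection can be closed before the session ends, but closing the session
always closes whatever connections remain, which matches the "session is the longest living
object" rule above.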

> Clean up global and ThreadLocal variables in Hive
> -------------------------------------------------
>
>                 Key: HIVE-584
>                 URL: https://issues.apache.org/jira/browse/HIVE-584
>             Project: Hadoop Hive
>          Issue Type: Improvement
>    Affects Versions: 0.3.0, 0.3.1
>            Reporter: Zheng Shao
>             Fix For: 0.5.0
>
>         Attachments: hive-584-2009-9-10.patch, hive-584-2009-9-11.patch
>
>
> Currently in the Hive code there are several global and ThreadLocal variables that need to
> be cleaned up.
> Specifically, the following classes are involved:
> 1. HiveConf: contains Hive configurations (and a classloader).
> 2. Hive class: contains a static member Hive db. The Hive class contains a member HiveConf
> conf, as well as ThreadLocal storage of IMetaStoreClient.
> 3. SessionState: contains static ThreadLocal storage of SessionState. The SessionState
> class contains a Hive db, a HiveConf conf, a history logger, and several standard input/output
> streams.
> 4. CliSessionState: SessionState plus some command options and the command file name.
> 5. All classes that try to get the Hive db or HiveConf from the global static Hive db or from
> SessionState.
> There are several problems with the current design. To name a few:
> 1. SessionState instances are ThreadLocal, but SessionState contains a Hive db, which also
> contains ThreadLocal storage. It is not clear whether a db can be shared across different
> threads. What is the global static Hive db?
> 2. We pass HiveConf and Hive db to classes like Task in two ways: sometimes through initialize(),
> sometimes through SessionState. This complicates the code a lot. It is hard to know which HiveConf
> and which db we should use.
> We need to think about a better way to do it.
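
As an illustration of the second problem in the quoted description, the sketch below shows one
possible direction: a task receives its configuration and db through initialize() only, so there
is a single place to look. This is an assumption made for illustration and not what the attached
patches do; TaskContext, SketchTask, and DescribeTask are hypothetical names, and plain strings
stand in for HiveConf and the Hive db.

```java
import java.util.Objects;

// Hypothetical bundle of the two objects the description says are passed inconsistently.
final class TaskContext {
    final String confName; // stands in for HiveConf
    final String dbName;   // stands in for the Hive db
    TaskContext(String confName, String dbName) {
        this.confName = Objects.requireNonNull(confName);
        this.dbName = Objects.requireNonNull(dbName);
    }
}

// A task gets its context through initialize() and never from static or ThreadLocal state.
abstract class SketchTask {
    private TaskContext ctx;

    final void initialize(TaskContext ctx) {
        this.ctx = Objects.requireNonNull(ctx);
    }

    // Fails fast when a task is used before initialize() was called.
    protected final TaskContext context() {
        if (ctx == null) {
            throw new IllegalStateException("initialize() was not called");
        }
        return ctx;
    }

    abstract String execute();
}

// Example task that reports which db and conf it was handed.
final class DescribeTask extends SketchTask {
    @Override String execute() {
        return context().dbName + "@" + context().confName;
    }
}
```

With a single entry point, the question "which HiveConf and which db should we use" has one
answer per task instance: the ones passed to initialize().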

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

