hadoop-hive-dev mailing list archives

From "Prasad Chakka (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-584) Clean up global and ThreadLocal variables in Hive
Date Fri, 26 Jun 2009 17:57:47 GMT

    [ https://issues.apache.org/jira/browse/HIVE-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724639#action_12724639 ]

Prasad Chakka commented on HIVE-584:

The name db for Hive.java objects is a misnomer: all such an object does is metadata access.
If we are cleaning up the code, we should rename it.

A Hive.java object can be shared across multiple threads as long as they share the same HiveConf.
But that is of little use, since we do want multiple threads to have separate HiveConfs.

We should make Hive.db, Hive.conf, and Hive.metastoreClient all non-static, non-thread-local
variables and make the Hive.java object itself a thread-local variable. We need a Hive.java
object that is independent of SessionState or CliSessionState, because a lot of code runs where
those objects don't exist. Passing conf around on all of the calls makes the code cumbersome.

public class Hive {

  private HiveConf conf;
  private IMetaStoreClient msc;

  /**
   * Creates and returns the Hive object representing metadata for this thread. If a Hive
   * object already exists for this thread, the passed HiveConf is compared with the HiveConf
   * stored in that object. If any parameters have changed, a new Hive object is created
   * and returned.
   */
  public static Hive get(HiveConf c) throws HiveException {}

  /**
   * Similar to get(HiveConf), but with a new HiveConf object.
   */
  public static Hive get() throws HiveException {}
}
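
To make the get(HiveConf) semantics described above concrete, here is a minimal, self-contained
sketch. It is not the actual Hive code: Conf is a hypothetical stand-in for HiveConf (a plain
property map), the metastore client is left out, and the choice that get() reuses an existing
per-thread object before falling back to a fresh Conf is my assumption about the intent.

```java
import java.util.HashMap;
import java.util.Map;

public class ThreadLocalHiveSketch {

    /** Hypothetical stand-in for HiveConf: just a bag of string properties. */
    static final class Conf {
        private final Map<String, String> props = new HashMap<>();

        Conf set(String key, String value) {
            props.put(key, value);
            return this;
        }

        /** Copy of the current properties, used to detect later changes. */
        Map<String, String> snapshot() {
            return new HashMap<>(props);
        }
    }

    /** Stand-in for Hive: the instance is thread local, its members are plain fields. */
    static final class Hive {
        private static final ThreadLocal<Hive> CURRENT = new ThreadLocal<>();

        // Non-static, non-thread-local state, as proposed in the comment.
        private final Map<String, String> confSnapshot;

        private Hive(Conf conf) {
            this.confSnapshot = conf.snapshot();
        }

        /** Returns this thread's Hive, replacing it if the configuration has changed. */
        static Hive get(Conf conf) {
            Hive hive = CURRENT.get();
            if (hive == null || !hive.confSnapshot.equals(conf.snapshot())) {
                hive = new Hive(conf);
                CURRENT.set(hive);
            }
            return hive;
        }

        /** Assumption: reuse this thread's Hive if present, else build one from a new Conf. */
        static Hive get() {
            Hive hive = CURRENT.get();
            return hive != null ? hive : get(new Conf());
        }
    }

    public static void main(String[] args) throws Exception {
        Conf conf = new Conf().set("hive.metastore.local", "true");

        Hive first = Hive.get(conf);
        System.out.println("same conf, same object: " + (first == Hive.get(conf)));

        conf.set("hive.metastore.local", "false");
        Hive second = Hive.get(conf);
        System.out.println("changed conf, new object: " + (second != first));

        Hive[] fromOtherThread = new Hive[1];
        Thread worker = new Thread(() -> fromOtherThread[0] = Hive.get(conf));
        worker.start();
        worker.join();
        System.out.println("other thread, own object: " + (fromOtherThread[0] != second));
    }
}
```

Taking a snapshot of the properties at construction time is what lets a later get(conf) notice
that the caller mutated the same Conf object. The real comparison against a HiveConf may well
need to look different.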

public class SessionState {

   public static Hive getHive(HiveConf conf) throws HiveException {
       return Hive.get(conf);
   }

   public static Hive getDb() throws HiveException {
       return Hive.get();
   }

   public HiveConf getConf() throws HiveException {
       return getDb().getConf();
   }

   private SessionState(HiveConf conf) {
   }

   /** returns thread local SessionState and creates it on first call */
   public static SessionState get(HiveConf conf);
   public static SessionState get();
}

SessionState also refers to this thread-local Hive object instead of storing it as a class member.
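
As a rough illustration of that last point, the sketch below shows a thread-local SessionState
that keeps no Hive field and looks up the per-thread Hive on every call. Both classes are
simplified stand-ins with hypothetical bodies (no HiveConf, no HiveException), so treat it as a
picture of the delegation only.

```java
public class SessionStateSketch {

    /** Stand-in for Hive: one instance per thread, created lazily. */
    static final class Hive {
        private static final ThreadLocal<Hive> CURRENT = ThreadLocal.withInitial(Hive::new);

        private Hive() {
        }

        static Hive get() {
            return CURRENT.get();
        }
    }

    /** Stand-in for SessionState: also one per thread, created on first call. */
    static final class SessionState {
        private static final ThreadLocal<SessionState> CURRENT =
                ThreadLocal.withInitial(SessionState::new);

        private SessionState() {
        }

        static SessionState get() {
            return CURRENT.get();
        }

        /** No Hive field here: always delegate to the thread-local Hive. */
        static Hive getDb() {
            return Hive.get();
        }
    }

    public static void main(String[] args) {
        System.out.println(SessionState.get() == SessionState.get());
        System.out.println(SessionState.getDb() == Hive.get());
    }
}
```

With this shape, code that has no SessionState can still call Hive.get() directly and see the
same object that SessionState.getDb() would return on that thread.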

Would this work?

> Clean up global and ThreadLocal variables in Hive
> -------------------------------------------------
>                 Key: HIVE-584
>                 URL: https://issues.apache.org/jira/browse/HIVE-584
>             Project: Hadoop Hive
>          Issue Type: Improvement
>    Affects Versions: 0.3.0, 0.3.1
>            Reporter: Zheng Shao
> Currently in Hive code there are several global and ThreadLocal variables that need to
> be cleaned.
> Specifically, the following classes are involved:
> 1. HiveConf: contains hive configurations (and a classloader)
> 2. Hive class: contains a static member Hive db. Hive class contains a member HiveConf
> conf, as well as a ThreadLocal storage of IMetaStoreClient.
> 3. SessionState: contains a static ThreadLocal storage of SessionState. SessionState
> class contains a Hive db, a HiveConf conf, a history logger, and a bunch of standard input/output
> 4. CliSessionState: SessionState plus some command options and the command file name.
> 5. All classes that try to get Hive db or HiveConf from global static Hive db, or SessionState.
> There are several problems with the current design. To name a few:
> 1. SessionState instances are ThreadLocal, but SessionState contains Hive db which also
> contains ThreadLocal storage. Not sure a db can be shared across different threads or not?
> What is the global static Hive db?
> 2. We pass HiveConf and Hive db in two ways to classes like Task: Sometimes through initialize(),
> sometimes through SessionState. This complicates the code a lot. It's hard to know which HiveConf
> and which db we should use.
> We need to think about a better way to do it.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
