hive-issues mailing list archives

From "Chao Sun (JIRA)" <>
Subject [jira] [Commented] (HIVE-14524) BaseSemanticAnalyzer may leak HMS connection
Date Fri, 12 Aug 2016 04:30:22 GMT


Chao Sun commented on HIVE-14524:

OK, did some debugging. Here's how step 4) above plays out:
1. In the HS2 handler thread, {{Driver#compile}} initializes a {{BaseSemanticAnalyzer}}.
Since the config has changed, the old {{Hive}} instance is replaced with a new one. Let's
call the old one *A* and the new one *B*. The {{BaseSemanticAnalyzer}} instance is
initialized with *B*.
2. Immediately after that, an HMS connection is created for *B* to flush the metastore cache.
3. The handler thread launches a background thread for query execution, which first sets
the thread-local {{Hive}} instance from the handler's {{parentHive}} field, *which still
refers to A*. So now *B* is overwritten by *A*! No variable refers to *B* any more, and *B*
holds an open HMS connection...
4. The background thread executes the query and opens new HMS connections.
5. After the background thread is done, the handler thread sets the thread-local {{Hive}}
instance to {{sessionHive}}, which also points to *A*. So now the handler thread is using
*A* again.

As a result, *B* is permanently lost, along with its open connection.
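
To make the sequence above easier to follow, here is a minimal, self-contained simulation. It is not Hive code: {{FakeHive}}, the connection counter and the {{simulate}} method are hypothetical stand-ins, and the only thing borrowed from the description is the order in which the thread-local is set. It assumes that closing the session only closes the instance still reachable through the thread-local.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalLeakSketch {
    // Number of simulated HMS connections currently open (hypothetical counter).
    static final AtomicInteger openConnections = new AtomicInteger();

    // Hypothetical stand-in for Hive: holds at most one simulated connection.
    static final class FakeHive {
        final String name;
        private boolean connected;
        FakeHive(String name) { this.name = name; }
        void connect() {
            if (!connected) { connected = true; openConnections.incrementAndGet(); }
        }
        void close() {
            if (connected) { connected = false; openConnections.decrementAndGet(); }
        }
    }

    // Stand-in for the thread-local Hive holder.
    static final ThreadLocal<FakeHive> current = new ThreadLocal<>();

    // Replays steps 1-5 and returns how many connections remain open afterwards.
    static int simulate() {
        openConnections.set(0);
        final FakeHive a = new FakeHive("A");
        current.set(a);
        final FakeHive parentHive = a;   // handler fields captured before the config change
        final FakeHive sessionHive = a;

        // Step 1: the config changed, so compile replaces the thread-local with B.
        current.set(new FakeHive("B"));
        // Step 2: B opens a connection (the metastore cache flush).
        current.get().connect();

        // Steps 3-4: the background thread installs A from parentHive and runs the query.
        Thread background = new Thread(() -> {
            current.set(parentHive);
            current.get().connect();
            current.remove();
        });
        background.start();
        try {
            background.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        // Step 5: the handler thread goes back to sessionHive (A); B is now unreachable.
        current.set(sessionHive);

        // Session close (assumption): only the instance in the thread-local is closed.
        current.get().close();
        current.remove();
        return openConnections.get();
    }

    public static void main(String[] args) {
        System.out.println("connections left open: " + simulate());
    }
}
```

Running {{main}} prints {{connections left open: 1}}: the connection opened on *B* in step 2 is never closed, because nothing refers to *B* after step 5.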

> BaseSemanticAnalyzer may leak HMS connection
> --------------------------------------------
>                 Key: HIVE-14524
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 2.2.0
>            Reporter: Chao Sun
>            Assignee: Chao Sun
> Currently {{BaseSemanticAnalyzer}} keeps a copy of the thread-local {{Hive}} object to connect
> to HMS. However, in some cases Hive may overwrite the existing {{Hive}} object:
> {{Hive#getInternal}}:
> {code}
>   private static Hive getInternal(HiveConf c, boolean needsRefresh, boolean isFastCheck,
>       boolean doRegisterAllFns) throws HiveException {
>     Hive db = hiveDB.get();
>     if (db == null || !db.isCurrentUserOwner() || needsRefresh
>         || (c != null && db.metaStoreClient != null && !isCompatible(db, c, isFastCheck))) {
>       return create(c, false, db, doRegisterAllFns);
>     }
>     if (c != null) {
>       db.conf = c;
>     }
>     return db;
>   }
> {code}
> *This poses a potential problem*: if one first instantiates a {{BaseSemanticAnalyzer}}
> object with the current {{Hive}} object (let's call it A), and for some reason A is overwritten
> by B through the code above, then {{BaseSemanticAnalyzer}} may keep using A to contact HMS,
> which will leak connections.
> This can be reproduced by the following steps:
> 1. open a session
> 2. execute some simple query such as {{desc formatted src}}
> 3. change a metastore property (I know, this is not a perfect example...), for instance:
> {{set hive.txn.timeout=500}}
> 4. run another command such as {{desc formatted src}} again
> Notice that in step 4), since a metavar has changed, {{isCompatible}} returns false,
> and hence a new {{Hive}} object is created. As a result, you'll observe in the HS2 log that
> a connection has been leaked.
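
The stale-reference situation from the quoted description can also be sketched in isolation. The snippet below is only an illustration under stated assumptions: {{FakeHive}}, {{FakeAnalyzer}} and {{get}} are hypothetical names, and the compatibility check is reduced to comparing a single setting, leaving out the owner check, {{needsRefresh}} and everything else the real {{getInternal}} does.

```java
import java.util.Objects;

public class StaleReferenceSketch {
    // Hypothetical stand-in for Hive; remembers the setting it was created with.
    static final class FakeHive {
        final String txnTimeout;
        FakeHive(String txnTimeout) { this.txnTimeout = txnTimeout; }
    }

    // Stand-in for the thread-local holder (named after hiveDB in the quoted code).
    static final ThreadLocal<FakeHive> hiveDB = new ThreadLocal<>();

    // Simplified analogue of the replace-on-incompatible branch: one setting compared.
    static FakeHive get(String txnTimeout) {
        FakeHive db = hiveDB.get();
        if (db == null || !Objects.equals(db.txnTimeout, txnTimeout)) {
            db = new FakeHive(txnTimeout);
            hiveDB.set(db);
        }
        return db;
    }

    // Hypothetical analyzer that caches the object it saw at construction time.
    static final class FakeAnalyzer {
        final FakeHive db;
        FakeAnalyzer(String txnTimeout) { this.db = get(txnTimeout); }
    }

    // True when the analyzer's cached object is no longer the thread-local one.
    static boolean analyzerIsStale(String settingBefore, String settingAfter) {
        hiveDB.remove();
        FakeAnalyzer analyzer = new FakeAnalyzer(settingBefore);
        FakeHive current = get(settingAfter);
        return analyzer.db != current;
    }

    public static void main(String[] args) {
        System.out.println(analyzerIsStale("300", "300")); // setting unchanged
        System.out.println(analyzerIsStale("300", "500")); // setting changed
    }
}
```

With an unchanged setting the analyzer and the thread-local share one object; once the setting changes they diverge, and any connection held by the object that is no longer tracked has no owner left to close it.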

This message was sent by Atlassian JIRA
