hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-12170) normalize HBase metastore connection configuration
Date Thu, 15 Oct 2015 19:31:05 GMT

    [ https://issues.apache.org/jira/browse/HIVE-12170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14959478#comment-14959478 ]

Sergey Shelukhin edited comment on HIVE-12170 at 10/15/15 7:30 PM:
-------------------------------------------------------------------

Will look at the patch after lunch :)
The problems arise only with an embedded metastore (including one embedded in HS2). For the
test, it appears that the different configs might have been caused by the issue fixed in
HIVE-12062, where the config in the testing util is reset after the HBase minicluster config is
set, so all subsequent code uses a different config.
Another scenario is embedded metastore usage by any service that gets different configs, like
the Tez AM. The Tez AM should not rely on the default config to create the metastore and should
instead rely on the config of the query; I had problems with that before due to some static call
to the metastore, where the Tez AM would create an ObjectStore even though it was configured
later to connect to a remote metastore via a query config.
For HS2, I don't know if we support connecting to multiple metastores. However, accessing an
embedded metastore from multiple threads may cause a thread safety problem.
Also, a static like that seems pretty brittle in an abstract sense, and the API get(conf) is
misleading, because it may return an existing instance that was created with a different conf,
and the conf passed in only takes effect for instances created by later calls.

If we assume the same conf, perhaps we should not reset staticConf if it is already set, and
should throw if a different conf is passed.
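
To make the suggestion concrete, here is a minimal sketch of a "set once, throw on mismatch"
guard. It is not taken from the patch: the class name StaticConfGuardSketch, the methods
setConf/getConf/resetForTest, and the use of a plain Map as a stand-in for a Hadoop
Configuration are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for the metastore accessor; not Hive's actual class.
final class StaticConfGuardSketch {
    // A plain immutable map stands in for a Hadoop Configuration (assumption).
    private static Map<String, String> staticConf;

    private StaticConfGuardSketch() {}

    // The first caller wins. Later callers must pass an equal conf or get an exception.
    static synchronized void setConf(Map<String, String> conf) {
        Objects.requireNonNull(conf, "conf");
        if (staticConf == null) {
            staticConf = Map.copyOf(conf); // defensive copy so callers cannot mutate it later
            return;
        }
        if (!staticConf.equals(conf)) {
            throw new IllegalStateException(
                "conf is already set to a different value; refusing to reset it");
        }
        // Same conf again: nothing to do, and nothing is reset.
    }

    static synchronized Map<String, String> getConf() {
        if (staticConf == null) {
            throw new IllegalStateException("conf has not been set yet");
        }
        return staticConf;
    }

    // Test-only hook so each test can start from a clean slate.
    static synchronized void resetForTest() {
        staticConf = null;
    }
}
```

With a guard like this, a second caller that shows up with a different conf fails at the call
site instead of silently working against somebody else's settings. Comparing by value is a
design choice of the sketch; comparing by identity would be stricter.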




> normalize HBase metastore connection configuration
> --------------------------------------------------
>
>                 Key: HIVE-12170
>                 URL: https://issues.apache.org/jira/browse/HIVE-12170
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Priority: Blocker
>             Fix For: 2.0.0
>
>         Attachments: HIVE-12170.patch
>
>
> Right now there are two ways to get an HBaseReadWrite instance in the metastore. Both get a
> threadlocal instance (is there a good reason for that?).
> 1) One is without conf and only works if someone called (2) before, from any thread.
> 2) The other blindly sets a static conf and then gets an instance with that conf, or,
> if someone already happened to call (1) or (2) from this thread, it returns the existing
> instance with whatever conf was set before (but still resets the current conf to the new conf).
> This doesn't make sense even in an already-thread-safe case (like linear CLI-based tests),
> and can easily lead to bugs as described; the config propagation logic is not good (for
> example, HIVE-12167); some calls just reset the config blindly, so there's no point in setting
> staticConf, other than for the callers of method (1) above who don't have a conf and would
> rely on the static (which is bad design).
> It is not possible to reliably have connections with different configs, and multi-threaded
> cases would also break: you could even set a conf, have it reset, and get an instance with
> somebody else's conf.
> The static should definitely be removed, maybe the threadlocal too (HConnection is thread-safe).
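
The behavior described above can be reproduced with a small model. This is only a sketch of
the pattern as the description states it, not the real HBaseReadWrite code: the class
ReadWriteSketch, its method names, and the Map used in place of a Hadoop Configuration are
hypothetical.

```java
import java.util.Map;

// Hypothetical model of the pattern in the issue description; not Hive's actual code.
final class ReadWriteSketch {
    // Shared by all threads and deliberately unsynchronized, as in the problem description.
    private static Map<String, String> staticConf;
    private static final ThreadLocal<ReadWriteSketch> SELF = new ThreadLocal<>();

    private final Map<String, String> conf; // the conf this instance was created with

    private ReadWriteSketch(Map<String, String> conf) {
        this.conf = conf;
    }

    // Method (2): blindly overwrites the static conf, then returns the thread's instance.
    static ReadWriteSketch getInstance(Map<String, String> conf) {
        staticConf = conf;
        return getInstance();
    }

    // Method (1): works only if some thread has already called method (2).
    static ReadWriteSketch getInstance() {
        if (staticConf == null) {
            throw new IllegalStateException("no conf has been set yet");
        }
        ReadWriteSketch rw = SELF.get();
        if (rw == null) {
            rw = new ReadWriteSketch(staticConf);
            SELF.set(rw);
        }
        return rw; // may carry an older conf than the one just passed in
    }

    Map<String, String> conf() {
        return conf;
    }

    static Map<String, String> currentStaticConf() {
        return staticConf;
    }

    // Test-only hook: clears the static conf and the calling thread's cached instance.
    static void clearForTest() {
        staticConf = null;
        SELF.remove();
    }
}
```

In this model, calling getInstance(confA) and then getInstance(confB) on the same thread
returns the same object, still holding confA, while the static now points at confB. Any other
thread that then calls the no-argument getInstance() for the first time builds its instance
from confB.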



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
