drill-issues mailing list archives

From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5496) Must restart drillbits whenever a secure Hive metastore is restarted
Date Fri, 12 May 2017 18:06:04 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008494#comment-16008494 ]

Paul Rogers commented on DRILL-5496:
------------------------------------

As it turns out, the Hive client in the Hive storage plugin is not designed to handle security.

* When we start the Hive storage plugin, we create a single instance of the {{HiveSchemaFactory}}.
* {{HiveSchemaFactory}} holds on to a {{DrillHiveMetaStoreClient}} connection. In the secure
case, this connection is used to get security certificates for use in creating secure connections.
* {{HiveSchemaFactory}} has a Guava loading cache of user-specific, secure connections.

When the Hive metastore goes down, all connections become invalid: both the non-secure
connection and all the secure connections. We try to handle the problem as follows.

If a secure connection times out:

* Use the (now-invalid) non-secure connection to get another ticket. Since that connection is
no longer valid, we can't reconnect, and so we always fail.

If we try to use a cached secure connection before timeout, then this happens:

* Try to send a message.
* When that fails, try to reconnect (using the old certificate from the prior session).
* When that fails, give up.

What we really need to do is:

* Recreate both the insecure *and* secure connections.

But, since the secure connection cache is held on the insecure connection, we can't easily
recreate that connection: we'd get a new object.

So, we have to make some changes.

* Hold the secure connection cache on an object other than a connection.
* Use a connection proxy instead of the connection as key to the cache. The proxy allows maintaining
the cache entry, but replacing the secure connection with a new one. (The proxy is just a
wrapper around a replaceable secure connection.)
* Similarly, provide a thread-safe way to reconnect the non-secure connection used to get
tickets for the secure connection.
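
A minimal sketch of the proxy idea, not the actual Drill code. Every name here ({{Connection}}, {{ConnectionProxy}}, {{SecureConnectionCache}}) is hypothetical, and the sketch assumes the cache is looked up by user name with the proxy as the long-lived entry. The point is only that the cache lives on its own object and that a failed connection is replaced by a brand-new one rather than reconnected with the old ticket.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical stand-in for a metastore client; not a Drill or Hive class.
interface Connection {
    String query(String request) throws Exception;
}

// Wrapper around a replaceable connection. The cache holds the proxy, so the
// cache entry survives when the underlying connection is swapped out.
final class ConnectionProxy {
    private final Supplier<Connection> factory;
    private Connection current;

    ConnectionProxy(Supplier<Connection> factory) {
        this.factory = factory;
        this.current = factory.get();
    }

    // Run a request; on failure, build a fresh connection once and retry.
    // If the retry also fails, the exception propagates to the caller.
    synchronized String query(String request) throws Exception {
        try {
            return current.query(request);
        } catch (Exception e) {
            current = factory.get(); // new connection, new ticket
            return current.query(request);
        }
    }
}

// The per-user cache is held on its own object, not on a connection.
final class SecureConnectionCache {
    private final ConcurrentHashMap<String, ConnectionProxy> byUser = new ConcurrentHashMap<>();
    private final Function<String, Connection> connect;

    // 'connect' is an assumed callback that opens a secure connection for a user.
    SecureConnectionCache(Function<String, Connection> connect) {
        this.connect = connect;
    }

    ConnectionProxy forUser(String user) {
        return byUser.computeIfAbsent(user, u -> new ConnectionProxy(() -> connect.apply(u)));
    }
}
```

A caller would fetch the proxy with {{forUser(userName)}} and send requests through it. After a metastore restart the first request fails internally, the proxy swaps in a new connection, and the same cache entry keeps working.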

All this is not a huge project, but it is more than can be done in the context of a quick
fix for this ticket. So, for this ticket I used a bit of a hack: just throw away the entire
schema builder and create a new one. But that solution requires synchronizing all requests
and is far from ideal.
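
For illustration only, the shape of that hack might look like the sketch below. It is not the patch itself: {{RebuildingHolder}} and its {{Call}} interface are hypothetical names, and the generic type {{T}} stands in for the schema builder. Note the single lock around every request, which is the cost mentioned above.

```java
import java.util.function.Supplier;

// Holds one shared object (standing in for the schema builder) and rebuilds it
// from scratch whenever a request against it fails.
final class RebuildingHolder<T> {
    // A request that may throw, e.g. because the metastore was restarted.
    interface Call<T, R> {
        R apply(T target) throws Exception;
    }

    private final Supplier<T> factory;
    private T current;

    RebuildingHolder(Supplier<T> factory) {
        this.factory = factory;
        this.current = factory.get();
    }

    // Every request takes the same lock, so requests are fully serialized.
    synchronized <R> R call(Call<T, R> op) throws Exception {
        try {
            return op.apply(current);
        } catch (Exception e) {
            current = factory.get(); // discard everything and start over
            return op.apply(current);
        }
    }
}
```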

> Must restart drillbits whenever a secure Hive metastore is restarted
> --------------------------------------------------------------------
>
>                 Key: DRILL-5496
>                 URL: https://issues.apache.org/jira/browse/DRILL-5496
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.11.0
>
>
> DRILL-4964: "Drill fails to connect to hive metastore after hive metastore is restarted
> unless drillbits are restarted also" attempted to fix a bug in Drill in which Drill hangs
> if Hive is restarted. Now, we see that all subsequent "show schemas" queries fail.
> Steps to repro:
> 1. Build a secure cluster (we used MapR)
> 2. Install Hive and Drill services
> 3. Configure drill impersonation and authentication
> 4. Restart hivemeta service
> 5. Connect to drill and execute query involving hive storage, issue occurs
> 6. Restart the drill-bits services and execute the query, issue is no longer hit
> The problem occurs in the same place as the earlier fix, but might represent a slightly
> different use case: in this case the connection is secure.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
