drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5496) Must restart drillbits whenever a secure Hive metastore is restarted
Date Mon, 15 May 2017 18:33:04 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011063#comment-16011063 ]

ASF GitHub Bot commented on DRILL-5496:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/833#discussion_r116559073
  
    --- Diff: contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveStoragePlugin.java ---
    @@ -95,8 +99,63 @@ public HiveScan getPhysicalScan(String userName, JSONOptions selection, List<Sch
         }
       }
     
    +  // Forced to synchronize this method to allow error recovery
    +  // in the multi-threaded case. Can remove synchronized only
    +  // by restructuring connections and cache to allow better
    +  // recovery from failed secure connections.
    +
       @Override
    -  public void registerSchemas(SchemaConfig schemaConfig, SchemaPlus parent) throws IOException {
    +  public synchronized void registerSchemas(SchemaConfig schemaConfig, SchemaPlus parent) throws IOException {
    +    try {
    +      schemaFactory.registerSchemas(schemaConfig, parent);
    +      return;
    +
    +    // Hack. We may need to retry the connection. But, we can't because
    +    // the retry logic is implemented in the very connection we need to
    +    // discard and rebuild. To work around, we discard the entire schema
    +    // factory, and all its invalid connections. Very crude, but the
    +    // easiest short-term solution until we refactor the code to do the
    +    // job properly. See DRILL-5510.
    +
    +    } catch (Throwable e) {
    +      // Unwrap exception
    +      Throwable ex = e;
    +      for (;;) {
    +        // Case for failing on an invalid cached connection
    +        if (ex instanceof MetaException ||
    +            // Case for a timed-out impersonated connection, and
    +            // an invalid non-secure connection used to get security
    +            // tokens.
    +            ex instanceof TTransportException) {
    +          break;
    +        }
    +
    +        // All other exceptions are not handled, just pass along up
    +        // the stack.
    +
    +        if (ex.getCause() == null  ||  ex.getCause() == ex) {
    +          throw new DrillRuntimeException( "Unknown Hive error", e );
    +        }
    +        ex = ex.getCause();
    +      }
    +    }
    +
    +    // Build a new factory which will cause an all new set of
    +    // Hive metastore connections to be created.
    +
    +    try {
    +      schemaFactory.close();
    +    } catch (Throwable t) {
    +      // Ignore, we're in a bad state.
    --- End diff --
    
    Fixed.
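
    For readers following the review, the approach described in the diff
    comments (walk the exception cause chain, and if the failure looks like a
    stale connection, discard the whole factory, build a new one, and try
    once more) can be sketched in isolation. This is only an illustration
    under stated assumptions, not the code in the pull request:
    StaleConnectionException, SchemaSource and RebuildOnFailure are
    hypothetical names standing in for the Hive/Thrift exception types and
    the Drill schema factory, and the sketch retries exactly once.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the connection failures named in the diff
// (MetaException, TTransportException). Not a real Hive or Thrift class.
class StaleConnectionException extends RuntimeException {
  StaleConnectionException(String msg) { super(msg); }
}

// Hypothetical stand-in for the schema factory and its cached connections.
interface SchemaSource {
  String listSchemas();
  void close();
}

final class RebuildOnFailure {
  private final Supplier<SchemaSource> builder;
  private SchemaSource source;

  RebuildOnFailure(Supplier<SchemaSource> builder) {
    this.builder = builder;
    this.source = builder.get();
  }

  // Walks the cause chain and reports whether any link is a failure we
  // know how to recover from. Stops on a null or self-referential cause.
  static boolean isRecoverable(Throwable e) {
    for (Throwable ex = e; ex != null; ex = ex.getCause()) {
      if (ex instanceof StaleConnectionException) {
        return true;
      }
      if (ex.getCause() == ex) {
        return false;
      }
    }
    return false;
  }

  // Synchronized so that only one thread at a time can discard and
  // rebuild the shared source, mirroring the rationale in the diff.
  synchronized String listSchemas() {
    try {
      return source.listSchemas();
    } catch (RuntimeException e) {
      if (!isRecoverable(e)) {
        throw e; // Unknown failure: pass it up the stack unchanged.
      }
    }
    try {
      source.close();
    } catch (RuntimeException ignored) {
      // The old source is already in a bad state; nothing more to do.
    }
    source = builder.get(); // Fresh source, hence fresh connections.
    return source.listSchemas(); // Single retry; a second failure propagates.
  }
}
```

    The sketch deliberately keeps the coarse lock: as the diff comment notes,
    removing it would require restructuring how connections and the cache
    are shared.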


> Must restart drillbits whenever a secure Hive metastore is restarted
> --------------------------------------------------------------------
>
>                 Key: DRILL-5496
>                 URL: https://issues.apache.org/jira/browse/DRILL-5496
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.11.0
>
>
> DRILL-4964: "Drill fails to connect to hive metastore after hive metastore is restarted unless drillbits are restarted also" attempted to fix a bug in Drill in which Drill hangs if Hive is restarted. Now, we see that all subsequent "show schemas" queries fail.
> Steps to repro:
> 1. Build a secure cluster (we used MapR)
> 2. Install Hive and Drill services
> 3. Configure drill impersonation and authentication
> 4. Restart hivemeta service
> 5. Connect to drill and execute query involving hive storage, issue occurs
> 6. Restart the drill-bits services and execute the query, issue is no longer hit
> The problem occurs in the same place as the earlier fix, but might represent a slightly different use case: in this case the connection is secure.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
