drill-issues mailing list archives

From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5496) Must restart drillbits whenever a secure Hive metastore is restarted
Date Tue, 09 May 2017 20:46:04 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16003524#comment-16003524 ]

Paul Rogers commented on DRILL-5496:
------------------------------------

Full stack trace at failure:

{code}
2017-05-01 16:03:00,232 [26f86b8b-c25f-4593-99b6-03f1d927aeee:foreman] WARN  o.a.d.e.s.h.DrillHiveMetaStoreClient - Failure while attempting to get hive databases. Retries once.
org.apache.hadoop.hive.metastore.api.MetaException: Got exception: org.apache.thrift.transport.TTransportException null
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:1213) ~[hive-metastore-1.2.0-mapr-1608.jar:1.2.0-mapr-1608]
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllDatabases(HiveMetaStoreClient.java:1033) ~[hive-metastore-1.2.0-mapr-1608.jar:1.2.0-mapr-1608]
        at org.apache.drill.exec.store.hive.DrillHiveMetaStoreClient.getDatabasesHelper(DrillHiveMetaStoreClient.java:203) ~[drill-storage-hive-core-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.store.hive.DrillHiveMetaStoreClient$DatabaseLoader.load(DrillHiveMetaStoreClient.java:505) [drill-storage-hive-core-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.store.hive.DrillHiveMetaStoreClient$DatabaseLoader.load(DrillHiveMetaStoreClient.java:498) [drill-storage-hive-core-1.10.0.jar:1.10.0]
        at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527) [guava-18.0.jar:na]
        at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319) [guava-18.0.jar:na]
        at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282) [guava-18.0.jar:na]
        at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2197) [guava-18.0.jar:na]
        at com.google.common.cache.LocalCache.get(LocalCache.java:3937) [guava-18.0.jar:na]
        at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941) [guava-18.0.jar:na]
        at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824) [guava-18.0.jar:na]
        at org.apache.drill.exec.store.hive.DrillHiveMetaStoreClient$HiveClientWithAuthzWithCaching.getDatabases(DrillHiveMetaStoreClient.java:411) [drill-storage-hive-core-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.store.hive.schema.HiveSchemaFactory$HiveSchema.getSubSchema(HiveSchemaFactory.java:139) [drill-storage-hive-core-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.store.hive.schema.HiveSchemaFactory$HiveSchema.<init>(HiveSchemaFactory.java:133) [drill-storage-hive-core-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.store.hive.schema.HiveSchemaFactory.registerSchemas(HiveSchemaFactory.java:118) [drill-storage-hive-core-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.store.hive.HiveStoragePlugin.registerSchemas(HiveStoragePlugin.java:100) [drill-storage-hive-core-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.store.StoragePluginRegistryImpl$DrillSchemaFactory.registerSchemas(StoragePluginRegistryImpl.java:396) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.store.SchemaTreeProvider.createRootSchema(SchemaTreeProvider.java:110) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.store.SchemaTreeProvider.createRootSchema(SchemaTreeProvider.java:99) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.ops.QueryContext.getRootSchema(QueryContext.java:163) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.ops.QueryContext.getRootSchema(QueryContext.java:152) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.ops.QueryContext.getNewDefaultSchema(QueryContext.java:138) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.planner.sql.SqlConverter.<init>(SqlConverter.java:110) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:101) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:79) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:1050) [drill-java-exec-1.10.0.jar:1.10.0]
        at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:281) [drill-java-exec-1.10.0.jar:1.10.0]
{code}

As it turns out, the code already attempts to retry the connection (added by DRILL-4964):

{code}
  protected static List<String> getDatabasesHelper(final IMetaStoreClient mClient) throws TException {
    try {
      return mClient.getAllDatabases();
    } catch (MetaException e) {
      /*
         HiveMetaStoreClient is encapsulating both the MetaException/TExceptions inside MetaException.
         Since we don't have good way to differentiate, we will close older connection and retry once.
         This is only applicable for getAllTables and getAllDatabases method since other methods are
         properly throwing correct exceptions.
      */
      logger.warn("Failure while attempting to get hive databases. Retries once.", e);
      try {
        mClient.close();
      } catch (Exception ex) {
        logger.warn("Failure while attempting to close existing hive metastore connection. May leak connection.", ex);
      }
      mClient.reconnect();
      return mClient.getAllDatabases();
    }
  }
{code}

The log says:

{code}
WARN  o.a.d.e.s.h.DrillHiveMetaStoreClient - Failure while attempting to get hive databases. Retries once.
{code}

So, we got as far as the {{logger.warn()}} call in the catch block. That is, we caught the
exception thrown on the invalid connection and attempted to retry.

But the stack trace in the log points to:

{code}
DrillHiveMetaStoreClient.getDatabasesHelper(DrillHiveMetaStoreClient.java:203)
{code}

[Line 203|https://github.com/apache/drill/blob/b657d44feb527c8e3d83c9996c9220ec4d50aaf3/contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/DrillHiveMetaStoreClient.java]
is the first call to {{mClient.getAllDatabases()}}. This suggests that the retry was not actually
done.

Consider the code snippet shown earlier. Stepping through the failure scenario shows that
the following line fails:

{code}
      mClient.reconnect();
{code}

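Evidently this retry code does not work for a secure connection: {{reconnect()}} itself fails,
so the second call to {{mClient.getAllDatabases()}} is never reached.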
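
The control flow described above can be reproduced with a stand-in client. This is only a sketch: {{RetryFlowSketch}}, the {{Client}} interface, {{simulateFailedReconnect()}}, the trace list and the exception messages are all hypothetical names made up for illustration. They are not part of Drill or Hive, and the real failure inside {{reconnect()}} on a secure connection is not modeled here. The helper has the same try / close / reconnect / retry shape as {{getDatabasesHelper()}} and records each step it reaches.

```java
import java.util.ArrayList;
import java.util.List;

public class RetryFlowSketch {

  /** Hypothetical stand-in for the three client calls the helper uses. */
  interface Client {
    List<String> getAllDatabases() throws Exception;
    void close();
    void reconnect() throws Exception;
  }

  /** Same shape as getDatabasesHelper: try, close, reconnect, retry once. */
  static List<String> getDatabasesHelper(Client client, List<String> trace) throws Exception {
    try {
      trace.add("first getAllDatabases");
      return client.getAllDatabases();
    } catch (Exception e) {
      trace.add("warn: retries once");
      try {
        client.close();
      } catch (RuntimeException ex) {
        trace.add("warn: close failed");
      }
      trace.add("reconnect");
      client.reconnect(); // if this throws, the retry below never runs
      trace.add("second getAllDatabases");
      return client.getAllDatabases();
    }
  }

  /** Simulates a stale connection whose reconnect attempt also fails. */
  public static List<String> simulateFailedReconnect() {
    Client stale = new Client() {
      public List<String> getAllDatabases() throws Exception {
        throw new Exception("stale connection");
      }
      public void close() { }
      public void reconnect() throws Exception {
        throw new Exception("reconnect failed");
      }
    };
    List<String> trace = new ArrayList<>();
    try {
      getDatabasesHelper(stale, trace);
    } catch (Exception e) {
      trace.add("propagated: " + e.getMessage());
    }
    return trace;
  }

  public static void main(String[] args) {
    simulateFailedReconnect().forEach(System.out::println);
  }
}
```

In this simulation the recorded steps stop at "reconnect" and the second {{getAllDatabases()}} step never appears: the warning is emitted, but the exception from {{reconnect()}} propagates to the caller before the retry can run.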

> Must restart drillbits whenever a secure Hive metastore is restarted
> --------------------------------------------------------------------
>
>                 Key: DRILL-5496
>                 URL: https://issues.apache.org/jira/browse/DRILL-5496
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.11.0
>
>
> DRILL-4964: "Drill fails to connect to hive metastore after hive metastore is restarted unless drillbits are restarted also" attempted to fix a bug in Drill in which Drill hangs if Hive is restarted. Now, we see that all subsequent "show schemas" queries fail.
> Steps to repro:
> 1. Build a secure cluster (we used MapR)
> 2. Install Hive and Drill services
> 3. Configure drill impersonation and authentication
> 4. Restart hivemeta service
> 5. Connect to drill and execute query involving hive storage, issue occurs
> 6. Restart the drill-bits services and execute the query, issue is no longer hit
> The problem occurs in the same place as the earlier fix, but might represent a slightly different use case: in this case the connection is secure.



