drill-issues mailing list archives

From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-5510) Revisit connection failure recovery in Hive storage plugin
Date Sun, 14 May 2017 20:53:04 GMT
Paul Rogers created DRILL-5510:

             Summary: Revisit connection failure recovery in Hive storage plugin
                 Key: DRILL-5510
                 URL: https://issues.apache.org/jira/browse/DRILL-5510
             Project: Apache Drill
          Issue Type: Improvement
    Affects Versions: 1.11.0
            Reporter: Paul Rogers

DRILL-5496 describes a problem that occurs when the Hive metastore server is restarted while
Drill is running. The solution in that ticket is a workaround: we discard all cached Hive metastore
data and rebuild the metadata cache.

The original code tried to be more subtle: it detected that the connection had failed, reconnected,
and preserved the cache. DRILL-5496 describes the flaws in that approach for secure connections.
This ticket asks that we take the time to understand the Hive metadata code and restructure it
to preserve the cache across connection failures.

Note a subtle issue: when the Hive metastore goes down and comes back up, it may contain
different data. Anything could happen while the server is down: schemas could be upgraded, one
schema could be replaced with another, etc. So the caching mechanism, if it is to preserve data
across reconnects, must handle such changes.

Of course, such changes could occur even within a single connection, so the code should handle
such cases already.
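
To make the idea concrete, here is a minimal sketch of a cache that survives a reconnect but
revalidates each entry before trusting it. Everything in it is hypothetical and for illustration
only: MetastoreClient, ClientFactory, ConnectionLostException, and the per-table version token
are assumptions, not the Hive metastore API and not Drill's existing code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical signal that the metastore connection failed. */
class ConnectionLostException extends RuntimeException {
    ConnectionLostException(String message) { super(message); }
}

/** Hypothetical stand-in for a metastore client; NOT the real Hive API. */
interface MetastoreClient {
    /** Cheap call: a token assumed to change whenever the table's metadata changes. */
    long tableVersion(String table);
    /** Expensive call: fetches the full table metadata. */
    String loadTable(String table);
}

/** Hypothetical factory that opens a fresh connection. */
interface ClientFactory {
    MetastoreClient connect();
}

/** Keeps cached metadata across reconnects, but checks each entry's version before use. */
final class ReconnectingTableCache {
    private static final class Entry {
        final long version;
        final String metadata;
        Entry(long version, String metadata) {
            this.version = version;
            this.metadata = metadata;
        }
    }

    private final ClientFactory factory;
    private final Map<String, Entry> cache = new HashMap<>();
    private MetastoreClient client;

    ReconnectingTableCache(ClientFactory factory) {
        this.factory = factory;
        this.client = factory.connect();
    }

    synchronized String get(String table) {
        try {
            return lookup(table);
        } catch (ConnectionLostException e) {
            // Reconnect once and retry. The cache map is deliberately NOT cleared.
            client = factory.connect();
            return lookup(table);
        }
    }

    private String lookup(String table) {
        // Revalidate on every access, so changes made while the server was down
        // (or within a single connection) are detected the same way.
        long current = client.tableVersion(table);
        Entry cached = cache.get(table);
        if (cached != null && cached.version == current) {
            return cached.metadata;
        }
        String metadata = client.loadTable(table);
        cache.put(table, new Entry(current, metadata));
        return metadata;
    }
}
```

The design choice here is that a reconnect and a stale entry are handled by the same version
check, so the cache needs no special case for "the server was restarted". The cost is one cheap
round trip per lookup; whether the metastore offers a suitable change token is an open question
for this ticket.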
