hive-dev mailing list archives

From "Richard Williams (JIRA)" <>
Subject [jira] [Created] (HIVE-10410) Apparent race condition in HiveServer2 causing intermittent query failures
Date Mon, 20 Apr 2015 23:35:58 GMT
Richard Williams created HIVE-10410:

             Summary: Apparent race condition in HiveServer2 causing intermittent query failures
                 Key: HIVE-10410
             Project: Hive
          Issue Type: Bug
          Components: HiveServer2
    Affects Versions: 0.13.1
         Environment: CDH 5.3.3, CentOS 6.5
            Reporter: Richard Williams

On our secure Hadoop cluster, queries submitted to HiveServer2 through JDBC occasionally trigger
odd Thrift exceptions with messages such as "Read a negative frame size (-2147418110)!" or
"out of sequence response" in HiveServer2's connections to the metastore. For certain metastore
calls (for example, showDatabases), HiveMetaStoreClient converts these Thrift exceptions to
MetaExceptions, which prevents RetryingMetaStoreClient from retrying the calls, so the
failure propagates to the JDBC client.
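
To illustrate why the conversion matters, here is a minimal, self-contained sketch of a retry
wrapper that retries only one exception type. It is not Hive's code: TransportError,
ConvertedError, flakyCall, convertingCall and withRetries are all hypothetical stand-ins for the
Thrift exception, MetaException, the metastore call, the converting client and the retrying
client respectively.

```java
import java.util.concurrent.Callable;

class RetryMaskingSketch {
    /** Hypothetical stand-in for a transport-level failure that is safe to retry. */
    static class TransportError extends Exception {
        TransportError(String msg) { super(msg); }
    }

    /** Hypothetical stand-in for the exception type the failure is converted to. */
    static class ConvertedError extends Exception {
        ConvertedError(String msg) { super(msg); }
    }

    static int attempts = 0;

    /** Fails on the first attempt only, like an intermittent transport glitch. */
    static String flakyCall() throws TransportError {
        attempts++;
        if (attempts == 1) {
            throw new TransportError("out of sequence response");
        }
        return "default";
    }

    /** Same call, but the transport failure is converted before it reaches the caller. */
    static String convertingCall() throws ConvertedError {
        try {
            return flakyCall();
        } catch (TransportError e) {
            throw new ConvertedError(e.getMessage());
        }
    }

    /** Retries only on TransportError; any other exception propagates immediately. */
    static <T> T withRetries(int maxAttempts, Callable<T> call) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        TransportError last = null;
        for (int i = 0; i < maxAttempts; i++) {
            try {
                return call.call();
            } catch (TransportError e) {
                last = e; // retryable: loop and try again
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        attempts = 0;
        System.out.println("unconverted: " + withRetries(3, RetryMaskingSketch::flakyCall));

        attempts = 0;
        try {
            withRetries(3, RetryMaskingSketch::convertingCall);
        } catch (ConvertedError e) {
            System.out.println("converted: failed after " + attempts + " attempt(s)");
        }
    }
}
```

In this toy model the unconverted call succeeds on the second attempt, while the converted
failure escapes the wrapper after a single attempt, which matches the behavior described above.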

Note that, as far as we can tell, this issue appears to affect only queries that are submitted
with the runAsync flag on TExecuteStatementReq set to true (which, in practice, seems to mean
all JDBC queries), and it appears to manifest only when HiveServer2 is using the new HTTP
transport mechanism. When both of these conditions hold, we can reproduce the issue fairly
reliably by spawning about 100 simple, concurrent Hive queries (we have been using "show
databases"), two or three of which typically fail. However, when either of these conditions
does not hold, we are no longer able to reproduce the issue.
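
A reproduction along these lines could be sketched as follows. This is not the reporter's
actual harness: the class and method names are hypothetical, the JDBC URL is taken from the
command line rather than assumed, and opening one connection per query is an assumption. The
Hive JDBC driver would need to be on the classpath, and the URL would need to select the HTTP
transport and whatever authentication the cluster requires.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ConcurrentQueryRepro {

    /** Runs `count` copies of `task` concurrently and returns how many of them threw. */
    static int countFailures(int count, Callable<Void> task) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(count);
        try {
            List<Callable<Void>> tasks = Collections.nCopies(count, task);
            int failures = 0;
            // invokeAll blocks until every task has completed or failed.
            for (Future<Void> f : pool.invokeAll(tasks)) {
                try {
                    f.get();
                } catch (ExecutionException e) {
                    failures++;
                    System.err.println("query failed: " + e.getCause());
                }
            }
            return failures;
        } finally {
            pool.shutdown();
        }
    }

    /** One "show databases" query over its own JDBC connection (an assumption). */
    static Callable<Void> showDatabases(String jdbcUrl) {
        return () -> {
            try (Connection conn = DriverManager.getConnection(jdbcUrl);
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("show databases")) {
                while (rs.next()) {
                    // Drain the result set; only success or failure matters here.
                }
            }
            return null;
        };
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: ConcurrentQueryRepro <jdbc-url>");
            return;
        }
        int failures = countFailures(100, showDatabases(args[0]));
        System.out.println(failures + " of 100 queries failed");
    }
}
```

Separating countFailures from the JDBC task keeps the concurrency logic checkable without a
cluster; any Callable can be substituted for the query.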

Some example stack traces from the HiveServer2 logs:
