hive-dev mailing list archives

From "Siddharth Seth (JIRA)" <>
Subject [jira] [Created] (HIVE-15722) LLAP: Avoid marking a query as complete if the AMReporter runs into an error
Date Wed, 25 Jan 2017 07:35:26 GMT
Siddharth Seth created HIVE-15722:

             Summary: LLAP: Avoid marking a query as complete if the AMReporter runs into an error
                 Key: HIVE-15722
             Project: Hive
          Issue Type: Bug
            Reporter: Siddharth Seth
            Assignee: Siddharth Seth

When the AMReporter runs into an error (typically an intermittent one), we end up killing all
fragments on the daemon. This is done by marking the query as complete.
The AM keeps trying to schedule on this node, and those tasks fail because the daemon
structures have already been updated to show the query as complete.

Instead of clearing the structures, it's better to kill the fragments, and let a queryComplete
call come in from the AM.
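
The difference between the two behaviours can be sketched as follows. This is only an
illustration under simplifying assumptions, not the actual Hive code: the class
QueryTrackerSketch, its fields and its method names are all hypothetical, and fragments are
tracked per query as plain strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for the daemon's per-query bookkeeping.
// None of these names are taken from the Hive code base.
class QueryTrackerSketch {
    // Fragments currently running on this daemon, keyed by query id.
    private final Map<String, Set<String>> runningFragments = new HashMap<>();
    // Queries that have been marked complete on this daemon.
    private final Set<String> completedQueries = new HashSet<>();

    // Accept a new fragment unless the query is already marked complete.
    void registerFragment(String queryId, String fragmentId) {
        if (completedQueries.contains(queryId)) {
            throw new IllegalStateException(
                "Dag " + queryId + " already complete. Rejecting fragment [" + fragmentId + "]");
        }
        runningFragments.computeIfAbsent(queryId, k -> new HashSet<>()).add(fragmentId);
    }

    // Behaviour described as the problem: a reporter error marks the query complete,
    // so every later submission from the AM for this query is rejected.
    List<String> onReporterErrorMarkComplete(String queryId) {
        List<String> killed = killFragments(queryId);
        queryComplete(queryId);
        return killed;
    }

    // Proposed behaviour: kill the running fragments but leave the query registered.
    List<String> onReporterErrorKillOnly(String queryId) {
        return killFragments(queryId);
    }

    // Only the AM's explicit call clears the structures for the query.
    void queryComplete(String queryId) {
        runningFragments.remove(queryId);
        completedQueries.add(queryId);
    }

    // Remove and return all fragments currently tracked for the query.
    private List<String> killFragments(String queryId) {
        Set<String> fragments = runningFragments.get(queryId);
        if (fragments == null) {
            return new ArrayList<>();
        }
        List<String> killed = new ArrayList<>(fragments);
        fragments.clear();
        return killed;
    }

    public static void main(String[] args) {
        QueryTrackerSketch tracker = new QueryTrackerSketch();
        tracker.registerFragment("query16", "Map 7, 28, 0");
        System.out.println("killed: " + tracker.onReporterErrorKillOnly("query16"));
        // The AM does not know about the error and submits again; this is still accepted.
        tracker.registerFragment("query16", "Map 7, 29, 0");
        tracker.queryComplete("query16");
    }
}
```

With onReporterErrorMarkComplete in place of onReporterErrorKillOnly, the second
registerFragment call in main would throw, which mirrors the rejection reported in this issue.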

Later, we could make enhancements in the AM to avoid such nodes. That's not simple though,
since the communication failure means the AM never finds out what happened on the daemon.

This leads to:
org.apache.hadoop.ipc.RemoteException(java.lang.RuntimeException): Dag query16 already complete. Rejecting fragment [Map 7, 29, 0]
	at org.apache.hadoop.hive.llap.daemon.impl.QueryTracker.registerFragment(
	at org.apache.hadoop.hive.llap.daemon.impl.ContainerRunnerImpl.submitWork(
	at org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon.submitWork(
	at org.apache.hadoop.hive.llap.daemon.impl.LlapProtocolServerImpl.submitWork(
	at org.apache.hadoop.hive.llap.daemon.rpc.LlapDaemonProtocolProtos$LlapDaemonProtocol$2.callBlockingMethod(
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
	at org.apache.hadoop.ipc.RPC$
	at org.apache.hadoop.ipc.Server$Handler$
	at org.apache.hadoop.ipc.Server$Handler$
	at Method)
	at org.apache.hadoop.ipc.Server$

This message was sent by Atlassian JIRA
