tez-issues mailing list archives

From "Rajesh Balamohan (Jira)" <j...@apache.org>
Subject [jira] [Commented] (TEZ-4273) Clear off staging files when TezYarnClient is unable to submit applications
Date Thu, 28 Jan 2021 02:30:00 GMT

    [ https://issues.apache.org/jira/browse/TEZ-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273262#comment-17273262 ]

Rajesh Balamohan commented on TEZ-4273:
---------------------------------------

Lines of interest:
 https://github.com/apache/tez/blob/master/tez-api/src/main/java/org/apache/tez/client/TezClient.java#L403
 https://github.com/apache/tez/blob/master/tez-api/src/main/java/org/apache/tez/client/TezClient.java#L408
 https://github.com/apache/tez/blob/master/tez-api/src/main/java/org/apache/tez/client/TezClientUtils.java#L470

When an exception occurs in submitApplication, TezClient should clean up the staging directory.
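
A minimal sketch of the cleanup-on-failure pattern suggested above. It is not the actual TezClient code: the `StagingCleanupSketch` class, the `Submitter` interface, and `submitWithCleanup` are hypothetical names, and local files via `java.nio.file` stand in for the HDFS staging directory that Tez would access through the Hadoop FileSystem API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class StagingCleanupSketch {

    /** Hypothetical stand-in for the submission call; not a Tez or YARN interface. */
    @FunctionalInterface
    public interface Submitter {
        void submit() throws Exception;
    }

    /**
     * Runs the submission. If it throws, removes the staging directory
     * (assumed to hold files such as tez-conf.pb) and rethrows the original exception.
     */
    public static void submitWithCleanup(Path stagingDir, Submitter submitter) throws Exception {
        try {
            submitter.submit();
        } catch (Exception e) {
            try {
                deleteRecursively(stagingDir);
            } catch (IOException | UncheckedIOException cleanupFailure) {
                // Keep the submission failure as the primary error.
                e.addSuppressed(cleanupFailure);
            }
            throw e;
        }
    }

    /** Deletes a directory tree, children before parents. */
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException ex) {
                    throw new UncheckedIOException(ex);
                }
            });
        }
    }
}
```

The original exception is rethrown so that callers still see the submission failure; a cleanup error is only attached as a suppressed exception.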

> Clear off staging files when TezYarnClient is unable to submit applications
> ---------------------------------------------------------------------------
>
>                 Key: TEZ-4273
>                 URL: https://issues.apache.org/jira/browse/TEZ-4273
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Priority: Major
>
> Currently, when an exception is encountered during app submission, a few resources such as "tez-conf.pb" are left behind. This causes issues in the cluster when apps keep submitting queries continuously.
> {noformat}
> drwx------   - hive supergroup          0 2021-01-28 01:58 /tmp/hive/hive/_tez_session_dir/806cd302-abd0-4694-be4a-b2b2473e75a8/.tez/application_1611791897439_0042
> -rw-r--r--   3 hive supergroup     135519 2021-01-28 01:58 /tmp/hive/hive/_tez_session_dir/806cd302-abd0-4694-be4a-b2b2473e75a8/.tez/application_1611791897439_0042/tez-conf.pb
> -rw-r--r--   3 hive supergroup       1056 2021-01-28 01:58 /tmp/hive/hive/_tez_session_dir/806cd302-abd0-4694-be4a-b2b2473e75a8/.tez/application_1611791897439_0042/tez.session.local-resources.pb

> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1611791897439_0042 to YARN : org.apache.hadoop.security.AccessControlException: Queue root.default already has X applications from user hive cannot accept submission of application: application_1611791897439_0042
> 	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:322) ~[hadoop-yarn-client-3.x:?]
> 	at org.apache.tez.client.TezYarnClient.submitApplication(TezYarnClient.java:77) ~[tez-api-0.9.x.jar:0.9.x]
> 	at org.apache.tez.client.TezClient.start(TezClient.java:405) ~[tez-api-0.9.x.jar:0.9.x]
> 	at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.startSessionAndContainers(TezSessionState.java:535) ~[hive-exec-3.x]
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
