hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Xuan Gong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient
Date Wed, 13 Apr 2016 17:32:25 GMT

    [ https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239658#comment-15239658
] 

Xuan Gong commented on YARN-4955:
---------------------------------

Thanks for the review. Looks like in 
{code}
      public Object run() throws IOException {
        // Try pass the request, if fail, keep retrying
        authUgi.checkTGTAndReloginFromKeytab();
        try {
          return authUgi.doAs(action);
        } catch (UndeclaredThrowableException e) {
          throw new IOException(e.getCause());
        } catch (InterruptedException e) {
          throw new IOException(e);
        }
      }
{code}
All the exceptions would be wrapped as IOException. So, currently catch IOException should
be enough. 
Have modify the patch to get the cause from IOException to check whether it is AuthenticationException.
and then further check the cause for this AuthenticationException.

> Add retry for SocketTimeoutException in TimelineClient
> ------------------------------------------------------
>
>                 Key: YARN-4955
>                 URL: https://issues.apache.org/jira/browse/YARN-4955
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>         Attachments: YARN-4955.1.patch, YARN-4955.2.patch
>
>
> We saw this exception several times when we tried to getDelegationToken from ATS.
> java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException:
java.net.SocketTimeoutException: Read timed out
> 	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:569)
> 	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:234)
> 	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:582)
> 	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.getDelegationToken(TimelineClientImpl.java:479)
> 	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken(YarnClientImpl.java:349)
> 	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(YarnClientImpl.java:330)
> 	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:250)
> 	at org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:291)
> 	at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:290)
> 	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
> 	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
> 	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> 	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
> 	at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128)
> 	at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:194)
> 	at java.lang.Thread.run(Thread.java:745)
> 	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
> Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException:
java.net.SocketTimeoutException: Read timed out
> 	at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:332)
> 	at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205)
> 	at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:128)
> 	at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:215)
> 	at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:285)
> 	at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.getDelegationToken(DelegationTokenAuthenticator.java:166)
> 	at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.getDelegationToken(DelegationTokenAuthenticatedURL.java:371)
> 	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:475)
> 	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:467)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> 	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:567)
> 	... 24 more
> Caused by: java.net.SocketTimeoutException: Read timed out
> 	at java.net.SocketInputStream.socketRead0(Native Method)
> 	at java.net.SocketInputStream.read(SocketInputStream.java:152)
> 	at java.net.SocketInputStream.read(SocketInputStream.java:122)
> 	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
> 	at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
> 	at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
> 	at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:690)
> 	at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633)
> 	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1325)
> 	at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
> 	at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.readToken(KerberosAuthenticator.java:357)
> 	at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.access$300(KerberosAuthenticator.java:54)
> 	at org.apache.hadoop.security.authentication.client.KerberosAuthenticator$1.run(KerberosAuthenticator.java:317)
> 	at org.apache.hadoop.security.authentication.client.KerberosAuthenticator$1.run(KerberosAuthenticator.java:287)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:287)
> 	... 36 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message