hadoop-common-issues mailing list archives

From "Kitti Nanasi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15655) KMS should retry upon SocketTimeoutException
Date Fri, 10 Aug 2018 11:54:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576162#comment-16576162 ]

Kitti Nanasi commented on HADOOP-15655:
---------------------------------------

Thanks [~gabor.bota] for the comment! I agree that the other possible invocations should be
tested as well, but in patch v003 I changed the code so that it only affects
LoadBalancingKMSClientProvider, so with the newest patch it makes sense to test it only in
TestLoadBalancingKMSClientProvider. Retrying upon SocketTimeoutException anywhere other than
LoadBalancingKMSClientProvider might cause unexpected behaviour.

I also added an "isIdempotent" flag to the KMS operations, so that it can be passed down to the
FailoverOnNetworkExceptionRetry policy, which retries on IOExceptions depending on whether the
operation is idempotent. So the new implementation will retry on SocketTimeoutException as
well, provided the operation is idempotent.
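
As a rough illustration of the decision described above (this is not the code from the patch),
the sketch below models a retry check that takes the idempotency flag into account. The class
name RetryDecisionSketch, the Decision enum and the shouldRetry method are hypothetical, and
the ConnectException branch is an assumption added for contrast; only the JDK exception
classes are real.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.SocketTimeoutException;

// Hypothetical, simplified model of an idempotency-aware retry decision.
// It does not use or reproduce the Hadoop retry policy classes.
public class RetryDecisionSketch {

    public enum Decision { RETRY, FAIL }

    public static Decision shouldRetry(Exception e, boolean isIdempotent) {
        if (e instanceof ConnectException) {
            // Assumption of this sketch: the connection was never established,
            // so the request cannot have been applied and a retry is safe.
            return Decision.RETRY;
        }
        if (e instanceof IOException) {
            // Covers SocketTimeoutException: the request may have reached the
            // server, so retry only when repeating the operation is harmless.
            return isIdempotent ? Decision.RETRY : Decision.FAIL;
        }
        // Anything that is not an IOException is treated as non-recoverable.
        return Decision.FAIL;
    }

    public static void main(String[] args) {
        Exception timeout = new SocketTimeoutException("Read timed out");
        System.out.println("idempotent op:     " + shouldRetry(timeout, true));
        System.out.println("non-idempotent op: " + shouldRetry(timeout, false));
    }
}
```

In this sketch a read timeout on an idempotent operation yields RETRY, while the same timeout
on a non-idempotent operation yields FAIL.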

> KMS should retry upon SocketTimeoutException
> --------------------------------------------
>
>                 Key: HADOOP-15655
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15655
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: kms
>    Affects Versions: 3.1.0
>            Reporter: Kitti Nanasi
>            Assignee: Kitti Nanasi
>            Priority: Critical
>         Attachments: HADOOP-15655.001.patch, HADOOP-15655.002.patch, HADOOP-15655.003.patch
>
>
> KMS doesn't retry upon SocketTimeoutException (the SSL connection was established, but
> the handshake timed out).
> {noformat}
> 6:08:55.315 PM	WARN	KMSClientProvider	
> Failed to connect to example.com:16000
> 6:08:55.317 PM	WARN	LoadBalancingKMSClientProvider	
> KMS provider at [https://example.com:16000/kms/v1/] threw an IOException: 
> java.net.SocketTimeoutException: Read timed out
> 	at java.net.SocketInputStream.socketRead0(Native Method)
> 	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> 	at java.net.SocketInputStream.read(SocketInputStream.java:171)
> 	at java.net.SocketInputStream.read(SocketInputStream.java:141)
> 	at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
> 	at sun.security.ssl.InputRecord.read(InputRecord.java:503)
> 	at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
> 	at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
> 	at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
> 	at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1397)
> 	at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
> 	at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
> 	at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:153)
> 	at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:186)
> 	at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:140)
> 	at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:348)
> 	at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.openConnection(DelegationTokenAuthenticatedURL.java:333)
> 	at org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:478)
> 	at org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:473)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
> 	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:472)
> 	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:788)
> 	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:288)
> 	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:284)
> 	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:124)
> 	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:284)
> 	at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:532)
> 	at org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:927)
> 	at org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:946)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:316)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:311)
> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:323)
> 	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:949)
> 	at org.apache.hadoop.hbase.util.FSUtils.getVersion(FSUtils.java:338)
> 	at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:423)
> 	at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:260)
> 	at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:151)
> 	at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:122)
> 	at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:795)
> 	at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2036)
> 	at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:553)
> 	at java.lang.Thread.run(Thread.java:748)
> 6:08:55.346 PM	WARN	LoadBalancingKMSClientProvider	
> Aborting since the Request has failed with all KMS providers(depending on hadoop.security.kms.client.failover.max.retries=1 setting and numProviders=1) in the group OR the exception is not recoverable
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

