hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11741) Long running balancer may fail due to expired DataEncryptionKey
Date Wed, 24 May 2017 21:48:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023750#comment-16023750
] 

Wei-Chiu Chuang commented on HDFS-11741:
----------------------------------------

Thanks [~xiaochen].
I like your suggestion. I initially just wanted to maintain the parity to DFSClient#newDataEncryptionKey.
But that's actually not needed: DFSClient does not have access to block key, so it has to
ask NameNode for DEK. Balancer KeyManager has access to block key, so it can generate DEK
on its own, no extra overhead for NN.

> Long running balancer may fail due to expired DataEncryptionKey
> ---------------------------------------------------------------
>
>                 Key: HDFS-11741
>                 URL: https://issues.apache.org/jira/browse/HDFS-11741
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer & mover
>         Environment: CDH5.8.2, Kerberos, Data transfer encryption enabled. Balancer login
using keytab
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>         Attachments: HDFS-11741.001.patch, HDFS-11741.002.patch, HDFS-11741.003.patch,
HDFS-11741.004.patch
>
>
> We found a long running balancer may fail despite using keytab, because KeyManager returns
expired DataEncryptionKey, and it throws the following exception:
> {noformat}
> 2017-04-30 05:03:58,661 WARN  [pool-1464-thread-10] balancer.Dispatcher (Dispatcher.java:dispatch(325))
- Failed to move blk_1067352712_3913241 with size=546650 from 10.0.0.134:50010:DISK to 10.0.0.98:50010:DISK
through 10.0.0.134:50010
> org.apache.hadoop.hdfs.protocol.datatransfer.InvalidEncryptionKeyException: Can't re-compute
encryption key for nonce, since the required block key (keyID=1005215027) doesn't exist. Current
key: 1005215030
>         at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.readSaslMessageAndNegotiatedCipherOption(DataTransferSaslUtil.java:417)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:474)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
>         at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:311)
>         at org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2300(Dispatcher.java:182)
>         at org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:899)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This bug is similar in nature to HDFS-10609. While balancer KeyManager actively synchronizes
itself with NameNode w.r.t block keys, it does not update DataEncryptionKey accordingly.
> In a specific cluster, with Kerberos ticket life time 10 hours, and default block token
expiration/life time 10 hours, a long running balancer failed after 20~30 hours.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message