hadoop-hdfs-issues mailing list archives

From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-10799) NameNode should use loginUser(hdfs) to serve iNotify requests
Date Thu, 25 Aug 2016 19:00:24 GMT

     [ https://issues.apache.org/jira/browse/HDFS-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-10799:
-----------------------------------
    Attachment: HDFS-10799.001.patch

v01: a quick fix. It still needs tests, and NameNodeRpcServer#getCurrentEditLogTxid probably
needs to be wrapped as well.

> NameNode should use loginUser(hdfs) to serve iNotify requests
> -------------------------------------------------------------
>
>                 Key: HDFS-10799
>                 URL: https://issues.apache.org/jira/browse/HDFS-10799
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.0
>         Environment: Kerberized, HA cluster, iNotify client, CDH5.7.0
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>         Attachments: HDFS-10799.001.patch
>
>
> When a NameNode serves iNotify requests from a client, it verifies that the client has superuser
permission and then uses the client's Kerberos principal to read edits from the journal nodes.
> However, if the client does not renew its TGT, the connection from the NameNode to the
journal nodes may fail. In that case, the NameNode concludes that the edits are corrupt and prints
a scary error message:
> "During automatic edit log failover, we noticed that all of the remaining edit log streams
are shorter than the current one!  The best remaining edit log ends at transaction 11577603,
but we thought we could read up to transaction 11577606.  If you continue, metadata will be
lost forever!"
> However, the edits are actually good. The NameNode _should not freak out when an iNotify
client's TGT expires_.
> I think an easy solution to this bug is for the NameNode, after verifying that the client has
superuser permission, to call {{SecurityUtil.doAsLoginUser}} and then read the edits. This ensures
that the operation does not fail because of an expired client ticket.
> Excerpt of related logs:
> {noformat}
> 2016-08-18 19:05:13,979 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
as:hdfs@EXAMPLE.COM (auth:KERBEROS) cause:java.io.IOException: We encountered an error reading
http://jn1.example.com:8480/getJournal?jid=nameservice1&segmentTxId=11577487&storageInfo=yyy,
http://jn1.example.com:8480/getJournal?jid=nameservice1&segmentTxId=11577487&storageInfo=yyy.
 During automatic edit log failover, we noticed that all of the remaining edit log streams
are shorter than the current one!  The best remaining edit log ends at transaction 11577603,
but we thought we could read up to transaction 11577606.  If you continue, metadata will be
lost forever!
> 2016-08-18 19:05:13,979 INFO org.apache.hadoop.ipc.Server: IPC Server handler 112 on
8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getEditsFromTxid from [client IP:port]
Call#73 Retry#0
> java.io.IOException: We encountered an error reading http://jn1.example.com:8480/getJournal?jid=nameservice1&segmentTxId=11577487&storageInfo=yyy,
http://jn1.example.com:8480/getJournal?jid=nameservice1&segmentTxId=11577487&storageInfo=yyy.
 During automatic edit log failover, we noticed that all of the remaining edit log streams
are shorter than the current one!  The best remaining edit log ends at transaction 11577603,
but we thought we could read up to transaction 11577606.  If you continue, metadata will be
lost forever!
>         at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:213)
>         at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.readOp(NameNodeRpcServer.java:1674)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEditsFromTxid(NameNodeRpcServer.java:1736)
>         at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getEditsFromTxid(AuthorizationProviderProxyClientProtocol.java:1010)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEditsFromTxid(ClientNamenodeProtocolServerSideTranslatorPB.java:1475)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> {noformat}
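
A minimal sketch of the ordering proposed in the description: check the caller's superuser permission first, then perform the journal read under the NameNode's own login identity rather than the caller's. This is not the attached patch and it does not use Hadoop classes. `Identity`, `doAs`, `readEdits`, `LOGIN_USER` and the `asLoginUser` flag are hypothetical stand-ins so that the example runs on the JDK alone; in the NameNode the corresponding pieces would be `UserGroupInformation` and `SecurityUtil.doAsLoginUser`. The sketch also assumes the login user is keytab-backed, so its ticket does not expire.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

final class InotifyDoAsSketch {
    /** Hypothetical identity: a name, a superuser flag, and whether its Kerberos ticket is usable. */
    static final class Identity {
        final String name;
        final boolean superuser;
        final boolean ticketValid;

        Identity(String name, boolean superuser, boolean ticketValid) {
            this.name = name;
            this.superuser = superuser;
            this.ticketValid = ticketValid;
        }
    }

    // Stand-in for the NameNode login user (assumed keytab-backed, so always valid here).
    static final Identity LOGIN_USER = new Identity("hdfs", true, true);

    // Stand-in for the "current user" that a real server tracks per call.
    private static final ThreadLocal<Identity> CURRENT = new ThreadLocal<>();

    /** Runs the action with the given identity in effect, then restores the previous one. */
    static <T> T doAs(Identity who, Callable<T> action) throws Exception {
        Identity previous = CURRENT.get();
        CURRENT.set(who);
        try {
            return action.call();
        } finally {
            CURRENT.set(previous);
        }
    }

    /** Stand-in for the journal read: fails when the effective identity's ticket has expired. */
    static String readEdits(long fromTxid) throws IOException {
        Identity who = CURRENT.get();
        if (who == null || !who.ticketValid) {
            throw new IOException("journal read failed for "
                + (who == null ? "unknown" : who.name));
        }
        return "edits from txid " + fromTxid + " read as " + who.name;
    }

    /** Permission is checked against the caller; the read runs as the caller or the login user. */
    static String getEditsFromTxid(Identity caller, long fromTxid, boolean asLoginUser)
            throws Exception {
        if (!caller.superuser) {
            throw new SecurityException("superuser privilege required: " + caller.name);
        }
        Identity effective = asLoginUser ? LOGIN_USER : caller;
        return doAs(effective, () -> readEdits(fromTxid));
    }

    public static void main(String[] args) throws Exception {
        // A superuser client whose ticket has expired.
        Identity stale = new Identity("inotify-client", true, false);
        System.out.println(getEditsFromTxid(stale, 11577487L, true));
        try {
            getEditsFromTxid(stale, 11577487L, false);
        } catch (IOException e) {
            System.out.println("without the wrap: " + e.getMessage());
        }
    }
}
```

With `asLoginUser` set, the expired client ticket no longer affects the read, while a non-superuser caller is still rejected before any read is attempted.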



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


