hadoop-hdfs-issues mailing list archives

From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-11197) Listing encryption zones fails when deleting an EZ that is on a snapshotted directory
Date Wed, 07 Dec 2016 18:01:59 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15729387#comment-15729387 ]

Wei-Chiu Chuang edited comment on HDFS-11197 at 12/7/16 6:01 PM:
-----------------------------------------------------------------

The test fails because of this error:
{noformat}
2016-12-07 00:51:50,706 [IPC Server handler 6 on 59757] INFO  ipc.Server (Server.java:logException(2697)) - IPC Server handler 6 on 59757, call Call#819 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.create from 127.0.0.1:42394
org.apache.hadoop.hdfs.server.namenode.RetryStartFileException: Preconditions for creating a file failed because of a transient error, retry create later.
	at org.apache.hadoop.hdfs.server.namenode.FSDirEncryptionZoneOp.getFileEncryptionInfo(FSDirEncryptionZoneOp.java:330)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2249)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2175)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:742)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:420)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:522)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:867)
	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:813)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1857)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2653)
{noformat}
I have seen the exact same test failure in the past, so most likely it's not your patch that caused it. [~xiaochen] previously filed HDFS-11093 for it.
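
For context, RetryStartFileException is how the NameNode signals that a precondition for the create (here, fetching the file's encryption info) failed transiently and the client should simply reissue the call. A minimal sketch of that client-side contract (illustrative only, not the actual DFSClient code; the retry bound and the argument names are placeholders):
{noformat}
// Illustrative sketch, not the real DFSClient loop: unwrap the
// RemoteException and, if it wraps RetryStartFileException, retry the
// create; anything else is a genuine failure.
final int MAX_CREATE_RETRIES = 10;  // assumed bound, for illustration
HdfsFileStatus stat = null;
for (int attempt = 0; stat == null; attempt++) {
  try {
    stat = namenode.create(src, perm, clientName, flag, createParent,
        replication, blockSize, versions);  // ClientProtocol#create RPC
  } catch (RemoteException re) {
    IOException unwrapped =
        re.unwrapRemoteException(RetryStartFileException.class);
    if (!(unwrapped instanceof RetryStartFileException)
        || attempt >= MAX_CREATE_RETRIES) {
      throw unwrapped;  // not the transient case, or out of retries
    }
    // transient: e.g. the zone's EDEK was not cached yet; just try again
  }
}
{noformat}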


was (Author: jojochuang):
That fails because of this error:
(same stack trace as quoted above)
I have seen the exact same test error in the past, so most likely it's not your patch that
caused it. I can file a jira if there's no one filed previously.

> Listing encryption zones fails when deleting an EZ that is on a snapshotted directory
> -------------------------------------------------------------------------------------
>
>                 Key: HDFS-11197
>                 URL: https://issues.apache.org/jira/browse/HDFS-11197
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 2.6.0
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Minor
>         Attachments: HDFS-11197-1.patch, HDFS-11197-2.patch, HDFS-11197-3.patch, HDFS-11197-4.patch, HDFS-11197-5.patch, HDFS-11197-6.patch, HDFS-11197-7.patch
>
>
> If an EZ directory is under a snapshottable directory, and a snapshot has been taken, then permanently deleting this EZ causes the *hdfs crypto -listZones* command to fail without showing any of the still-available zones.
> This happens only after the EZ is removed from the Trash folder. For example, assuming the */test-snap* folder is snapshottable and a snapshot already exists for it:
> {noformat}
> $ hdfs crypto -listZones
> /user/systest           my-key
> /test-snap/EZ-1       my-key
> $ hdfs dfs -rmr /test-snap/EZ-1
> INFO fs.TrashPolicyDefault: Moved: 'hdfs://ns1/test-snap/EZ-1' to trash at: hdfs://ns1/user/hdfs/.Trash/Current/test-snap/EZ-1
> $ hdfs crypto -listZones
> /user/systest           my-key
> /user/hdfs/.Trash/Current/test-snap/EZ-1  my-key 
> $ hdfs dfs -rmr /user/hdfs/.Trash/Current/test-snap/EZ-1
> Deleted /user/hdfs/.Trash/Current/test-snap/EZ-1
> $ hdfs crypto -listZones
> RemoteException: Absolute path required
> {noformat}
> Once this error happens, *hdfs crypto -listZones* only works again if we remove the snapshot:
> {noformat}
> $ hdfs dfs -deleteSnapshot /test-snap snap1
> $ hdfs crypto -listZones
> /user/systest           my-key
> {noformat}
> If we instead delete the EZ using the *-skipTrash* option, *hdfs crypto -listZones* does not break:
> {noformat}
> $ hdfs crypto -listZones
> /user/systest           my-key
> /test-snap/EZ-2  my-key
> $ hdfs dfs -rmr -skipTrash /test-snap/EZ-2
> Deleted /test-snap/EZ-2
> $ hdfs crypto -listZones
> /user/systest           my-key
> {noformat}
> The different behaviour seems to occur because, when the EZ is removed from the trash folder, its related INode is left with no parent INode. This causes *EncryptionZoneManager.listEncryptionZones* to throw the error seen above when it tries to resolve the inodes for the given path.
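> To illustrate the failure (a simplified sketch; method names follow the HDFS sources, though the exact call chain may differ): with no parent chain, rebuilding the zone's full path yields a name that no longer starts with "/", and resolving that name is what trips the error:
> {noformat}
> // Simplified sketch inside EncryptionZoneManager.listEncryptionZones:
> INode inode = dir.getInode(ezi.getINodeId());
> // The inode still exists (retained by the snapshot), but
> // inode.getParent() == null, so walking up the parent chain yields only
> // the local name, e.g. "EZ-1" instead of "/test-snap/EZ-1".
> String pathName = inode.getFullPathName();  // no longer an absolute path
> // Resolving it first checks for a leading "/", which fails here with
> // "Absolute path required".
> dir.getINodesInPath(pathName, true);
> {noformat}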
> I am proposing a patch that fixes this issue by performing an additional check in *EncryptionZoneManager.listEncryptionZones* for the case where an inode has no parent, so that such a zone is skipped in the listing rather than resolved. Feedback on the proposal is appreciated.
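> A rough sketch of the proposed guard (simplified; the attached patch is the authoritative version):
> {noformat}
> // In EncryptionZoneManager.listEncryptionZones, while iterating zones:
> final INode inode = dir.getInode(ezi.getINodeId());
> if (inode == null) {
>   continue;  // the zone's inode is gone entirely
> }
> if (inode.getParent() == null) {
>   // The zone was deleted and now survives only inside a snapshot: with
>   // no parent its absolute path cannot be rebuilt, so skip it instead of
>   // letting path resolution fail with "Absolute path required".
>   // (The root inode also has a null parent and would need special-casing
>   // if "/" itself were a zone; omitted in this sketch.)
>   continue;
> }
> {noformat}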





