hadoop-hdfs-issues mailing list archives

From "Brahma Reddy Battula (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2994) If lease is recovered successfully inline with create, create can fail
Date Wed, 18 Apr 2012 12:33:49 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256488#comment-13256488 ]

Brahma Reddy Battula commented on HDFS-2994:
--------------------------------------------

Hi,

I am able to reproduce the same failure.

 *Scenario 1: Using a debug point* 
  ============================
Step 1: Write a file /home/a.txt.
Step 2: Call append on /home/a.txt.
Step 3: Put a debug point in DFSClient at leaserenewer.put(src, result, this);
Step 4: When control reaches that point, rename the file to /home/rename.txt.

Now try to append to the renamed file (/home/rename.txt) again. I then get the same exception:

{noformat}
java.io.IOException: FSDirectory.replaceNode: failed to remove /home/rename.txt
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.replaceNode(FSDirectory.java:1119)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.prepareFileForWrite(FSNamesystem.java:1674)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1612)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1823)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:417)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:217)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:42592)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:423)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:891)
{noformat}


 *Scenario 2:* 
  ==========

Step 1: Write a file /home/a.txt (size 2MB).
Step 2: Call append on /home/a.txt (size 1.5MB).
Step 3: Restart the DataNode (DN) several times while step 2 is in progress.

The append then fails with the same exception:



{noformat}
java.io.IOException: FSDirectory.replaceNode: failed to remove /home/a.txt
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.replaceNode(FSDirectory.java:1119)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.prepareFileForWrite(FSNamesystem.java:1670)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1608)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1819)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:416)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:217)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:42592)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:417)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:891)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1661)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1657)
	at java.security.AccessController.doPrivileged(Native Method)


{noformat}


                
> If lease is recovered successfully inline with create, create can fail
> ----------------------------------------------------------------------
>
>                 Key: HDFS-2994
>                 URL: https://issues.apache.org/jira/browse/HDFS-2994
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.24.0
>            Reporter: Todd Lipcon
>
> I saw the following logs on my test cluster:
> {code}
> 2012-02-22 14:35:22,887 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: startFile: recover lease [Lease.  Holder: DFSClient_attempt_1329943893604_0007_m_000376_0_453973131_1, pendingcreates: 1], src=/benchmarks/TestDFSIO/io_data/test_io_6 from client DFSClient_attempt_1329943893604_0007_m_000376_0_453973131_1
> 2012-02-22 14:35:22,887 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering lease=[Lease.  Holder: DFSClient_attempt_1329943893604_0007_m_000376_0_453973131_1, pendingcreates: 1], src=/benchmarks/TestDFSIO/io_data/test_io_6
> 2012-02-22 14:35:22,888 WARN org.apache.hadoop.hdfs.StateChange: BLOCK* internalReleaseLease: All existing blocks are COMPLETE, lease removed, file closed.
> 2012-02-22 14:35:22,888 WARN org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.replaceNode: failed to remove /benchmarks/TestDFSIO/io_data/test_io_6
> 2012-02-22 14:35:22,888 WARN org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile: FSDirectory.replaceNode: failed to remove /benchmarks/TestDFSIO/io_data/test_io_6
> {code}
> It seems like, if {{recoverLeaseInternal}} succeeds in {{startFileInternal}}, then the INode will be replaced with a new one, meaning the later {{replaceNode}} call can fail.
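
The stale-reference pattern suggested in the issue description above can be illustrated with a small toy model. This is only a sketch and not the NameNode code: the classes Inode and Directory, and the methods recoverLease, startFileStale and startFileRefreshed, are hypothetical names invented for illustration. It assumes that replaceNode succeeds only when the caller's old node is still the object stored at the path, and that lease recovery swaps in a new node. Re-reading the node after recovery is shown only as one possible way to avoid the failure; it is not claimed to be the fix adopted for HDFS-2994.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleInodeSketch {
    /** Toy stand-in for a namespace entry; object identity is what matters here. */
    static class Inode {
        final boolean underConstruction;
        Inode(boolean underConstruction) { this.underConstruction = underConstruction; }
    }

    /** Toy directory mapping a path to its current inode (hypothetical, not FSDirectory). */
    static class Directory {
        private final Map<String, Inode> entries = new HashMap<>();

        void put(String path, Inode node) { entries.put(path, node); }
        Inode get(String path) { return entries.get(path); }

        /** Swaps oldNode for newNode only if oldNode is still the object stored at path. */
        boolean replaceNode(String path, Inode oldNode, Inode newNode) {
            if (entries.get(path) != oldNode) {
                return false; // the caller held a stale reference
            }
            entries.put(path, newNode);
            return true;
        }
    }

    /** Assumption: successful lease recovery closes the file by installing a new inode. */
    static void recoverLease(Directory dir, String path) {
        dir.put(path, new Inode(false));
    }

    /** Looks the node up before recovery, then tries to replace it afterwards. */
    static boolean startFileStale(Directory dir, String path) {
        Inode before = dir.get(path);
        recoverLease(dir, path);
        return dir.replaceNode(path, before, new Inode(true));
    }

    /** Same flow, but re-reads the node after recovery before replacing it. */
    static boolean startFileRefreshed(Directory dir, String path) {
        recoverLease(dir, path);
        Inode current = dir.get(path);
        return dir.replaceNode(path, current, new Inode(true));
    }

    public static void main(String[] args) {
        Directory dir = new Directory();
        dir.put("/home/a.txt", new Inode(true));
        System.out.println("stale reference:     " + startFileStale(dir, "/home/a.txt"));
        System.out.println("refreshed reference: " + startFileRefreshed(dir, "/home/a.txt"));
    }
}
```

In this model the first call returns false because the node it tries to replace is no longer in the directory, which mirrors the "failed to remove" message in the logs; the second call returns true.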

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
