hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2994) If lease is recovered successfully inline with create, create can fail
Date Mon, 05 Aug 2013 17:50:49 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729713#comment-13729713 ]

Konstantin Shvachko commented on HDFS-2994:
-------------------------------------------

Looks like the problem is still there.
When a file is opened for append and the soft limit has expired, recoverLeaseInternal() may
finalize the file and replace the inode that myFile refers to with a closed one.
prepareFileForWrite() then tries to replace the same file again, which fails because
myFile is now a stale reference to the old inode.
The right fix is to refresh myFile after recoverLeaseInternal(), rather than setting its parent
field as the attached patch proposes.
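
To make the stale-reference problem concrete, here is a minimal toy model in Java. It is not
the FSNamesystem code: Inode, Directory, recoverLease and both append methods are hypothetical
stand-ins, and the only assumption carried over from the issue is that a replace fails when it is
handed an inode object that is no longer the one stored in the directory.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the stale-reference problem; NOT actual HDFS code. */
public class StaleInodeSketch {

    /** Hypothetical stand-in for an inode; object identity is what matters here. */
    static final class Inode {
        final String path;
        final boolean underConstruction;

        Inode(String path, boolean underConstruction) {
            this.path = path;
            this.underConstruction = underConstruction;
        }
    }

    /** Hypothetical stand-in for the namespace directory. */
    static final class Directory {
        private final Map<String, Inode> nodes = new HashMap<>();

        void add(Inode node) {
            nodes.put(node.path, node);
        }

        Inode lookup(String path) {
            return nodes.get(path);
        }

        /** Succeeds only if oldNode is the very object currently stored under its path. */
        boolean replaceNode(Inode oldNode, Inode newNode) {
            if (nodes.get(oldNode.path) != oldNode) {
                return false; // oldNode is stale: something else replaced it already
            }
            nodes.put(newNode.path, newNode);
            return true;
        }
    }

    /** Models lease recovery that closes the file by swapping in a finalized inode. */
    static void recoverLease(Directory dir, String path) {
        Inode current = dir.lookup(path);
        dir.replaceNode(current, new Inode(path, false));
    }

    /** Problematic order: keeps using the reference obtained before recovery. */
    static boolean appendWithStaleReference(Directory dir, String path) {
        Inode myFile = dir.lookup(path);
        recoverLease(dir, path);
        return dir.replaceNode(myFile, new Inode(path, true));
    }

    /** Suggested order: look the inode up again after recovery. */
    static boolean appendWithRefreshedReference(Directory dir, String path) {
        recoverLease(dir, path);
        Inode myFile = dir.lookup(path); // refreshed reference to the closed inode
        return dir.replaceNode(myFile, new Inode(path, true));
    }

    public static void main(String[] args) {
        String path = "/benchmarks/TestDFSIO/io_data/test_io_6";

        Directory a = new Directory();
        a.add(new Inode(path, true));
        System.out.println("stale reference:     " + appendWithStaleReference(a, path));

        Directory b = new Directory();
        b.add(new Inode(path, true));
        System.out.println("refreshed reference: " + appendWithRefreshedReference(b, path));
    }
}
```

In this model the first call returns false and the second returns true, which mirrors the
"failed to remove" warning versus the behaviour expected once myFile is refreshed.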
                
> If lease is recovered successfully inline with create, create can fail
> ----------------------------------------------------------------------
>
>                 Key: HDFS-2994
>                 URL: https://issues.apache.org/jira/browse/HDFS-2994
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.24.0
>            Reporter: Todd Lipcon
>            Assignee: amith
>         Attachments: HDFS-2994_1.patch, HDFS-2994_1.patch
>
>
> I saw the following logs on my test cluster:
> {code}
> 2012-02-22 14:35:22,887 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: startFile: recover lease [Lease.  Holder: DFSClient_attempt_1329943893604_0007_m_000376_0_453973131_1, pendingcreates: 1], src=/benchmarks/TestDFSIO/io_data/test_io_6 from client DFSClient_attempt_1329943893604_0007_m_000376_0_453973131_1
> 2012-02-22 14:35:22,887 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Recovering lease=[Lease.  Holder: DFSClient_attempt_1329943893604_0007_m_000376_0_453973131_1, pendingcreates: 1], src=/benchmarks/TestDFSIO/io_data/test_io_6
> 2012-02-22 14:35:22,888 WARN org.apache.hadoop.hdfs.StateChange: BLOCK* internalReleaseLease: All existing blocks are COMPLETE, lease removed, file closed.
> 2012-02-22 14:35:22,888 WARN org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.replaceNode: failed to remove /benchmarks/TestDFSIO/io_data/test_io_6
> 2012-02-22 14:35:22,888 WARN org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile: FSDirectory.replaceNode: failed to remove /benchmarks/TestDFSIO/io_data/test_io_6
> {code}
> It seems like, if {{recoverLeaseInternal}} succeeds in {{startFileInternal}}, then the
> INode will be replaced with a new one, meaning the later {{replaceNode}} call can fail.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
