hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-2991) failure to load edits: ClassCastException
Date Thu, 23 Feb 2012 19:16:53 GMT

     [ https://issues.apache.org/jira/browse/HDFS-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HDFS-2991:
--------------------------------------

    Target Version/s: 0.24.0, 0.23.1, 0.22.1  (was: 0.23.1, 0.24.0)

It seems 0.22 has the same problem:
{{startFileInternal()}} journals OP_ADD for a new file via {{dir.addFile()}}, but never
journals it when an existing file is opened for append.
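
To illustrate why a missing OP_ADD on append can surface as the ClassCastException quoted below, here is a minimal toy model of edit log replay. It is only a sketch under stated assumptions: {{ToyFile}}, {{ToyFileUnderConstruction}}, {{replay()}} and {{log()}} are hypothetical names, not the real FSEditLogLoader or INode classes, and the choice of OP_CLOSE as the opcode that performs the cast is an assumption, since the trace does not say which opcode was being replayed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EditReplaySketch {
    // Hypothetical stand-ins for a closed file and a file open for write.
    static class ToyFile { }
    static class ToyFileUnderConstruction extends ToyFile { }

    enum Op { OP_ADD, OP_CLOSE }

    static final class Edit {
        final Op op;
        final String path;
        Edit(Op op, String path) { this.op = op; this.path = path; }
    }

    // Replays edits into an in-memory namespace (toy model only).
    static Map<String, ToyFile> replay(List<Edit> edits) {
        Map<String, ToyFile> namespace = new HashMap<>();
        for (Edit e : edits) {
            switch (e.op) {
                case OP_ADD:
                    // Opening a file (create or append) puts it under construction.
                    namespace.put(e.path, new ToyFileUnderConstruction());
                    break;
                case OP_CLOSE:
                    // Assumption: close expects the file to be under construction.
                    // If no OP_ADD preceded it, this cast throws ClassCastException.
                    ToyFileUnderConstruction uc =
                        (ToyFileUnderConstruction) namespace.get(e.path);
                    namespace.put(e.path, new ToyFile());
                    break;
            }
        }
        return namespace;
    }

    static boolean replaySucceeds(List<Edit> edits) {
        try {
            replay(edits);
            return true;
        } catch (ClassCastException ex) {
            return false;
        }
    }

    // Builds a create/close/append/close history; the append's OP_ADD is optional.
    static List<Edit> log(boolean journalAppend) {
        List<Edit> edits = new ArrayList<>();
        edits.add(new Edit(Op.OP_ADD, "/f"));    // create
        edits.add(new Edit(Op.OP_CLOSE, "/f"));  // close after create
        if (journalAppend) {
            edits.add(new Edit(Op.OP_ADD, "/f")); // reopen for append
        }
        edits.add(new Edit(Op.OP_CLOSE, "/f"));  // close after append
        return edits;
    }

    public static void main(String[] args) {
        System.out.println(replaySucceeds(log(true)));   // prints true
        System.out.println(replaySucceeds(log(false)));  // prints false
    }
}
```

In this model the log written without the append's OP_ADD leaves the file closed when the second OP_CLOSE is replayed, so the cast fails. The real loader is more involved, so treat this only as a picture of the general mechanism.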
On the patch:
# Todd, you should call logOpenFile() first and then convertLastBlockToUnderConstruction(), because
the latter can throw IOException, and we will end up with a hanging OP_ADD in edits, leading
to a fake recovery on startup.
# You do not need to call logOpenFile() in the file creation case. It is already done in dir.addFile().
It would be very confusing to journal the same transaction twice.
# I am not wild about changing LAYOUT_VERSION to work around the bug. Should we come up with
a repair tool instead? It should be easy to implement with OIV.
Changing LAYOUT_VERSION will not be as simple as just incrementing it. We will have to do a simultaneous
jump for all versions, as we did in the past.
                
> failure to load edits: ClassCastException
> -----------------------------------------
>
>                 Key: HDFS-2991
>                 URL: https://issues.apache.org/jira/browse/HDFS-2991
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.24.0, 0.23.1
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Blocker
>         Attachments: hdfs-2991.txt, image-with-buggy-append.tgz
>
>
> In doing scale testing of trunk at r1291606, I hit the following:
> java.io.IOException: Error replaying edit log at offset 1354251
> Recent opcode offsets: 1350014 1350176 1350312 1354251
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:418)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:93)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:79)
> ...
> Caused by: java.lang.ClassCastException: org.apache.hadoop.hdfs.server.namenode.INodeFile cannot be cast to org.apache.hadoop.hdfs.server.namenode.INodeFileUnderConstruction
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:213)
>         ... 13 more
