hadoop-common-dev mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4663) Datanode should delete files under tmp when upgraded from 0.17
Date Thu, 29 Jan 2009 01:00:59 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668266#action_12668266 ]

dhruba borthakur commented on HADOOP-4663:

An offline discussion with Hairong revealed that the datanode needs to send a special-purpose
block report (for blocks in the blocksBeingWritten directory) at start-up. The Namenode has
to process this report specially: it should not insert these blocks into blocksMap; instead,
it should update the targets of the last blocks for filesUnderConstruction.

Given the "special" handling this requires, I think it is better to promote the blocks from
the "blocksBeingWritten" directory into the main data directory (after matching/truncating
the sizes of the block files and their crc files).

So, I propose that we do the following:
At start-up, the DN matches the blocks in the "blocksBeingWritten" directory with their meta
files. If the sizes do not match, the data file is truncated to the length described
by the CRCs in the metafile. This makes it likely that the block is a valid one. Then,
these blocks are promoted to the main block directory. (The generation-stamp protocol will
detect inconsistent replicas during lease recovery.)
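
To make the proposal concrete, here is a rough sketch of the match/truncate/promote step.
It is not taken from any patch on this issue. The class and method names are hypothetical,
and the metafile layout (a 7-byte header followed by one 4-byte CRC per bytesPerChecksum
bytes of data) is an assumption made for illustration. The sketch only reconciles lengths;
it does not re-verify the CRC of the last chunk.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; names and metafile layout are assumptions, not Hadoop code.
final class BbwPromotion {
    static final int META_HEADER_BYTES = 7; // assumed header size of the metafile
    static final int CRC_BYTES = 4;         // assumed size of one CRC entry

    // Longest data length that the CRC entries in the metafile can vouch for.
    static long coveredLength(long metaLen, long dataLen, int bytesPerChecksum) {
        long chunks = Math.max(0, (metaLen - META_HEADER_BYTES) / CRC_BYTES);
        return Math.min(dataLen, chunks * bytesPerChecksum);
    }

    // Metafile length needed to cover exactly validLen bytes of data.
    static long metaLengthFor(long validLen, int bytesPerChecksum) {
        long chunks = (validLen + bytesPerChecksum - 1) / bytesPerChecksum;
        return META_HEADER_BYTES + chunks * CRC_BYTES;
    }

    // Shrink the file to newLen if it is longer; never grow it.
    static void truncateTo(Path file, long newLen) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            if (raf.length() > newLen) {
                raf.setLength(newLen);
            }
        }
    }

    // Reconcile the two files, then move both into the main block directory.
    static long truncateAndPromote(Path data, Path meta, Path mainDir,
                                   int bytesPerChecksum) throws IOException {
        long valid = coveredLength(Files.size(meta), Files.size(data), bytesPerChecksum);
        truncateTo(data, valid);
        // Only trim the metafile if it still holds at least the assumed header.
        if (Files.size(meta) >= META_HEADER_BYTES) {
            truncateTo(meta, metaLengthFor(valid, bytesPerChecksum));
        }
        Files.createDirectories(mainDir);
        Files.move(data, mainDir.resolve(data.getFileName()));
        Files.move(meta, mainDir.resolve(meta.getFileName()));
        return valid;
    }
}
```

For example, with 512 bytes per checksum, a 1500-byte data file whose metafile holds only two
CRC entries would be cut back to 1024 bytes before being promoted.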

> Datanode should delete files under tmp when upgraded from 0.17
> --------------------------------------------------------------
>                 Key: HADOOP-4663
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4663
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Raghu Angadi
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.19.1
>         Attachments: deleteTmp.patch, deleteTmp2.patch, deleteTmp_0.18.patch, handleTmp1.patch
> Before 0.18, when Datanode restarts, it deletes files under data-dir/tmp  directory since
> these files are not valid anymore. But in 0.18 it moves these files to normal directory incorrectly
> making them valid blocks. One of the following would work :
> - remove the tmp files during upgrade, or
> - if the files under /tmp are in pre-18 format (i.e. no generation), delete them.
> Currently effect of this bug is that, these files end up failing block verification and
> eventually get deleted. But cause incorrect over-replication at the namenode before that.
> Also it looks like our policy regd treating files under tmp needs to be defined better.
> Right now there are probably one or two more bugs with it. Dhruba, please file them if you
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
