hadoop-common-dev mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2655) Copy on write for data and metadata files in the presence of snapshots
Date Tue, 05 Feb 2008 01:11:09 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565620#action_12565620 ]

dhruba borthakur commented on HADOOP-2655:

The datanode has a new directory per volume called "detachDir". This directory is used to
do temporary copy-on-write for data blocks that are part of a snapshot.

When a client writes a block that is linked to a snapshot, it does the following:

1. Copy the original block file to a new file in detachDir.
2. Rename the newly created file in detachDir over the original file. This breaks the hard link
and atomically leaves two independent copies of the block.
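
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the actual datanode code: the class and method names (DetachBlock, detach) are hypothetical, and the real FSDataset logic differs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class DetachBlock {
    // Hypothetical sketch of copy-on-write for a snapshot-linked block.
    static void detach(Path blockFile, Path detachDir) throws IOException {
        Path tmp = detachDir.resolve(blockFile.getFileName());
        // Step 1: copy the shared (hard-linked) block into detachDir.
        Files.copy(blockFile, tmp, StandardCopyOption.REPLACE_EXISTING);
        // Step 2: rename the copy over the original. On POSIX, rename()
        // atomically replaces the directory entry, so the snapshot keeps
        // its hard link to the old inode while the live block becomes a
        // fresh, independent file.
        Files.move(tmp, blockFile, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

After detach() returns, writes to the live block file no longer affect the snapshot's copy, since the two names now refer to different inodes.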

Step 2 works correctly on the Linux platform. The following are some caveats on the Windows platform.

On Windows, the rename fails because the target file already exists, so the code
issues a delete followed by a rename. This means that there is a window of opportunity (on
Windows) during which the block does not exist in the right place. If a read request for the
block arrives precisely in that window, the client will get an exception and will try to read
that block from an alternate location. (When a datanode restarts, it recovers blocks that
exist in detachDir but do not exist in the original data directory.) I am proposing that
this is an acceptable solution.
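
The Windows fallback and the restart-time recovery described above can be sketched roughly like this; again, DetachFallback and its method names are hypothetical illustrations, not the real datanode implementation.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class DetachFallback {
    // If an atomic replace fails (as on Windows, where rename onto an
    // existing file is rejected), fall back to delete-then-rename,
    // accepting a brief window where the block file is absent.
    static void moveIntoPlace(Path tmp, Path blockFile) throws IOException {
        try {
            Files.move(tmp, blockFile, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            Files.deleteIfExists(blockFile); // window begins: block missing
            Files.move(tmp, blockFile);      // window ends: block restored
        }
    }

    // On datanode restart: any file still sitting in detachDir whose
    // counterpart is missing from the data directory is moved back,
    // so a crash inside the window above loses no blocks.
    static void recover(Path detachDir, Path dataDir) throws IOException {
        try (DirectoryStream<Path> entries =
                 Files.newDirectoryStream(detachDir)) {
            for (Path p : entries) {
                Path target = dataDir.resolve(p.getFileName());
                if (!Files.exists(target)) {
                    Files.move(p, target);
                }
            }
        }
    }
}
```

A reader hitting the delete/rename window sees the block as missing and retries against another replica, which is why the window is tolerable rather than fatal.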

> Copy on write for data and metadata files in the presence of snapshots
> ----------------------------------------------------------------------
>                 Key: HADOOP-2655
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2655
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: dhruba borthakur
> If a DFS Client wants to append data to an existing file (appends, HADOOP-1700) and a
> snapshot is present, the Datanode has to implement some form of copy-on-write for writes
> to data and metadata files.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
