hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5709) Improve upgrade with existing files and directories named ".snapshot"
Date Tue, 21 Jan 2014 18:06:24 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877671#comment-13877671
] 

Suresh Srinivas commented on HDFS-5709:
---------------------------------------

bq.could you lay out your alternative proposal for a conf option? I could rename the conf
and make it so it takes a delimited set of kv pairs, e.g. ".snapshot=.user-snapshot,.some-new-reserved=.renamed-new-reserved",
but I felt that was kind of ugly. I wanted full sub rather than prefix here for flexibility.
This is not what I have in mind. /<path>/<reserved_file> could be renamed to /<path>/<reserved_file>+<configured_rename_suffix>.
The user finds all the renamed files (from the log) and renames them once the system comes
up, if necessary. In fact, come to think of it, the suffix should probably not be
configurable either. The system can choose some convention for the rename suffix, such as
".<layout_version>.reserved_renamed_after_upgrade". The user must run -upgrade with
a new option that allows renaming of reserved files.

The advantages of this are:
# At any time after the upgrade (until the user renames them), all the renamed files can be found
# The probability of conflict of file names during upgrade is lower
# Unnecessary configuration changes are avoided
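
A minimal sketch of the suffix convention described above, to make the rename and the later lookup concrete. Everything here is an assumption for illustration: the class and method names are hypothetical, the layout version 47 is an arbitrary example value, and paths are handled as plain strings rather than through any NameNode or fsimage API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class ReservedRenameSketch {
    // Fixed, non-configurable tail of the rename suffix (hypothetical convention).
    static final String SUFFIX_TAIL = ".reserved_renamed_after_upgrade";

    // Builds ".<layout_version>.reserved_renamed_after_upgrade".
    static String suffix(int layoutVersion) {
        return "." + layoutVersion + SUFFIX_TAIL;
    }

    // Appends the suffix to every path component that matches a reserved name.
    static String renameReserved(String path, Set<String> reserved, int layoutVersion) {
        String[] parts = path.split("/", -1);
        for (int i = 0; i < parts.length; i++) {
            if (reserved.contains(parts[i])) {
                parts[i] = parts[i] + suffix(layoutVersion);
            }
        }
        return String.join("/", parts);
    }

    // Returns the paths whose last component ends with the rename suffix,
    // i.e. the entries a user may want to rename again after the upgrade.
    static List<String> findRenamed(List<String> paths) {
        List<String> hits = new ArrayList<>();
        for (String p : paths) {
            if (p.endsWith(SUFFIX_TAIL)) {
                hits.add(p);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Set<String> reserved = Set.of(".snapshot");
        System.out.println(renameReserved("/user/alice/.snapshot/data", reserved, 47));
    }
}
```

Because the suffix tail is fixed, a user can locate renamed entries with a simple suffix match at any later point, without consulting configuration or the upgrade log.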

bq. With just a prefix mutation, I could easily imagine having to run some operation after
the NN had started up to then find all of the renamed paths and again rename them to some
other name with the prefix removed. That's a pain that we shouldn't put our users through.
I do not understand what the pain is. It is just renaming the files. Also, what if a user wants
to rename .snapshot in one directory to x and .snapshot in another directory to y, based on
the context of how the file is being used?

bq.  Empirically we rarely add reserved names (only two in the lifetime of HDFS so far) and
I don't anticipate adding many more
If we had looked at this same question last year, we would have said we would never have
reserved names in HDFS at all. There are some that I plan to add. I want to add some
directories in the future for storing file system specific information. This is the way I
envision moving fsimage and possibly finalized editlog segments into HDFS itself.

bq. ... but rather to do as Andrew suggests and add a conf option per reserved name, and add
the code necessary for each one as it comes up....
I do not think adding one configuration option for each reserved name is right. In fact, if
possible, we should avoid configuration changes altogether for renames that are required
*one time* during upgrade to a version that adds reserved file names.




> Improve upgrade with existing files and directories named ".snapshot"
> ---------------------------------------------------------------------
>
>                 Key: HDFS-5709
>                 URL: https://issues.apache.org/jira/browse/HDFS-5709
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 3.0.0, 2.2.0
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>              Labels: snapshots, upgrade
>         Attachments: hdfs-5709-1.patch, hdfs-5709-2.patch, hdfs-5709-3.patch, hdfs-5709-4.patch,
hdfs-5709-5.patch
>
>
> Right now in trunk, upgrade fails messily if the old fsimage or edits refer to a directory
named ".snapshot". We should at least print a better error message (which I believe was the
original intention in HDFS-4666), and [~atm] proposed automatically renaming these files and
directories.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
