hadoop-hdfs-issues mailing list archives

From "Eli Collins (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2514) Link resolution bug for intermediate symlinks with relative targets
Date Sun, 30 Oct 2011 01:57:32 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13139518#comment-13139518 ]

Eli Collins commented on HDFS-2514:
-----------------------------------

Here's a description of how link resolution works, which illustrates the bug, followed by a
fix. The format below is "call by client" --> "value returned by server". FC is FileContext and
ULE is UnresolvedLinkException. The call to "parent" below occurs in FC#qualifySymlinkTarget.

h2. Absolute links

h3. Link is the 1st path component

Given link -> /file
FC#op(/link) --> ULE
FC#getLinkTarget(/link) --> /file   # Because link is final path component
FC#op(/file)                        # Because "/file" is absolute

h3. Link is the intermediate component

Given link -> /dir2
FC#op(/dir/link/file) -->  ULE          
FC#getLinkTarget(/dir/link/file) --> /dir2/file   # Returns target + remainder ("file")
FC#op(/dir2/file)                                 # Because "/dir2" is absolute

h3. Link is the final component

Given link -> /file
FC#op(/dir/link) --> ULE
FC#getLinkTarget(/dir/link) --> /file   # Because link is final path component
FC#op(/file)                            # Because "/file" is absolute

h2. Relative links

h3. Link is the 1st path component

Given link -> file
FC#op(/link) --> ULE
FC#getLinkTarget(/link) --> file    # Because link is final path component
FC#op(parent(/link) + file)         # Because "file" is relative

h3. Link is the intermediate component

Given link -> dir2
FC#op(/dir/link/file) -->  ULE          
FC#getLinkTarget(/dir/link/file) --> dir2/file   # Returns target + remainder ("dir2/file")
FC#op(parent(/dir/link/file) + dir2/file)        # Because dir2 is relative (parent + target)

*This is /dir/link/dir2/file, which is incorrect.* It should be parent(/dir/link) + dir2/file,
ie /dir/dir2/file. But the client works from the full path and doesn't know where in the path
the link is.
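
The failing trace above can be reproduced with a small model. This is only a sketch in Python, not the Hadoop code: server_link_target and client_qualify are hypothetical names standing in for the server's current "target + remainder" behaviour and for FC#qualifySymlinkTarget, and the link's position is passed in explicitly for illustration.

```python
import posixpath

def server_link_target(path, link_path, target):
    """Hypothetical model of the current server: return target + remainder."""
    remainder = path[len(link_path):]      # "" if the link is final, else "/file"
    return target + remainder

def client_qualify(path, returned):
    """Hypothetical model of FC#qualifySymlinkTarget.

    The client only sees the full path, so a relative result is joined
    to parent(path), wherever the link actually sits in that path.
    """
    if posixpath.isabs(returned):
        return returned
    return posixpath.join(posixpath.dirname(path), returned)

# Intermediate relative link: link -> dir2, path /dir/link/file
returned = server_link_target("/dir/link/file", "/dir/link", "dir2")
resolved = client_qualify("/dir/link/file", returned)
print(returned, resolved)
```

With the link as the final component (link -> file, path /dir/link) the same two functions give /dir/file, which is why only the intermediate case is affected.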

h3. Link is the final component

Given link -> file
FC#op(/dir/link) --> ULE
FC#getLinkTarget(/dir/link) --> file    # Because link is final path component
FC#op(parent(/dir/link) + file)         # Because "file" is relative (parent + target)

h2. Proposed fix

In the above example, suppose we name the path components as follows:
path:       /dir/link/file
target:     dir2
preceding:  /dir
remainder:  /file

We change how the server resolves the path (in UnresolvedPathException#getResolvedPath).
The server is never given a relative path. If path refers to a link (ie the link is the final
path component) we return the target of the link verbatim (as we currently do). If target is
absolute we return target + remainder (as we currently do), otherwise we return
preceding + target + remainder. Ie we resolve the link correctly by replacing the link in the
path with its target. In this case we no longer return a relative path: we resolve the relative
target against path (which is absolute) and return the result.
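
The three server-side cases can be sketched as follows. This is a Python model under stated assumptions, not the patch itself: get_resolved_path is a hypothetical stand-in for UnresolvedPathException#getResolvedPath, link_path is assumed to be known to the server, and "preceding" is taken to be the parent of the link.

```python
import posixpath

def get_resolved_path(path, link_path, target):
    """Hypothetical model of the proposed server-side resolution."""
    remainder = path[len(link_path):]          # "" if the link is the final component
    if not remainder:
        return target                          # final component: target verbatim
    if posixpath.isabs(target):
        return target + remainder              # absolute: target + remainder
    preceding = posixpath.dirname(link_path)   # e.g. "/dir" for "/dir/link"
    # relative: preceding + target + remainder, so the result is always absolute
    return posixpath.join(preceding, target) + remainder

print(get_resolved_path("/dir/link/file", "/dir/link", "dir2"))
```

The sketch does not normalize targets containing ".." components; whether the server should do so is outside what the description above covers.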

We don't need to change the client. If getLinkTarget returns an absolute path we ignore the
parent of the link (as we currently do). If getLinkTarget returns a relative path, the link
must have been the final path component (the server now only returns a relative target in that
case), so we append the target to the link's parent, which is now correct.

Here's how the failing example works now:

Given link -> dir2
FC#op(/dir/link/file) -->  ULE          
FC#getLinkTarget(/dir/link/file) --> /dir/dir2/file   # Returns preceding + target + remainder
FC#op(/dir/dir2/file)                                 # Because /dir/dir2/file is absolute
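
The whole client/server exchange, including the retry after a ULE, can be modelled end to end. Again this is a hedged Python sketch rather than Hadoop code: the in-memory links dict, find_link, get_link_target, resolve and the max_hops limit are all assumptions made for illustration.

```python
import posixpath

def find_link(path, links):
    """Return (link_path, remainder) for the first symlink in path, or None."""
    parts = path.strip("/").split("/")
    for i in range(1, len(parts) + 1):
        prefix = "/" + "/".join(parts[:i])
        if prefix in links:
            rest = "/".join(parts[i:])
            return prefix, ("/" + rest if rest else "")
    return None

def get_link_target(path, links):
    """Hypothetical server side, using the proposed resolution."""
    link_path, remainder = find_link(path, links)
    target = links[link_path]
    if not remainder or posixpath.isabs(target):
        return target + remainder
    return posixpath.join(posixpath.dirname(link_path), target) + remainder

def resolve(path, links, max_hops=32):
    """Hypothetical client loop: op -> ULE -> getLinkTarget -> retry."""
    for _ in range(max_hops):
        if find_link(path, links) is None:
            return path                        # op succeeds, no ULE
        target = get_link_target(path, links)
        if not posixpath.isabs(target):        # unchanged client: parent + target
            target = posixpath.join(posixpath.dirname(path), target)
        path = target
    raise RuntimeError("too many levels of symbolic links")

print(resolve("/dir/link/file", {"/dir/link": "dir2"}))
```

The client branch for relative targets is kept exactly as before; it is only reached when the link is the final path component, where parent + target is the right answer.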
                
> Link resolution bug for intermediate symlinks with relative targets
> -------------------------------------------------------------------
>
>                 Key: HDFS-2514
>                 URL: https://issues.apache.org/jira/browse/HDFS-2514
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.21.0, 0.22.0, 0.23.0
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>         Attachments: hdfs-2514-1.patch
>
>
> There's a bug in the way the Namenode resolves intermediate symlinks (ie the symlink
> is not the final path component) in paths when the symlink's target is a relative path. Will
> post the full description in the first comment.


        
