hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4044) Create symbolic links in HDFS
Date Tue, 07 Oct 2008 21:38:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637663#action_12637663 ]

Doug Cutting commented on HADOOP-4044:
--------------------------------------

> so.. this is a performance compromise. But the arguments till now have been about "perfect"
...

This issue has long been performance-related.  The assumption is that it is unacceptable for
the addition of symlinks to double, or more than double, the number of RPCs for common HDFS
operations when no links are present.  I have noted this assumption on several occasions and
no one has disputed it.  Do you agree with this assumption, or do you feel we should examine
it more?
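
To make the RPC-count concern concrete, here is a minimal sketch of a naive client that asks
about links before every open.  FakeNameNode, isLink, readLink and naiveOpen are hypothetical
names invented for illustration; none of them come from the patch or from the HDFS API.  Each
call on the fake counts as one RPC.

```java
import java.util.HashMap;
import java.util.Map;

public class NaiveLinkCheck {
    /** Hypothetical stand-in for a remote namenode; every method call counts as one RPC. */
    static class FakeNameNode {
        final Map<String, String> links = new HashMap<>();
        int rpcCount = 0;

        boolean isLink(String path) { rpcCount++; return links.containsKey(path); }
        String readLink(String path) { rpcCount++; return links.get(path); }
        String open(String path)     { rpcCount++; return "data:" + path; }
    }

    /** Naive resolution: ask "is this a link?" before opening, even when no links exist. */
    static String naiveOpen(FakeNameNode nn, String path) {
        while (nn.isLink(path)) {
            path = nn.readLink(path);
        }
        return nn.open(path);
    }

    public static void main(String[] args) {
        FakeNameNode nn = new FakeNameNode();
        naiveOpen(nn, "/user/data.txt");
        // With no links present the client still pays for isLink plus open.
        System.out.println("RPCs with no links present: " + nn.rpcCount);
    }
}
```

In this sketch a plain open costs two calls instead of one, which is the doubling described
above.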

> in the patch it is not called 'nextLinkOrOpenImpl()'.. it is just called 'openImpl()'.


I thought the point of contention is not the name of the method or return value but whether
to use an exception or a data structure to pass the return value.  (That pseudo code also
referred to FileHandle, which is also not in the patch.)  Would you like to propose better
names for the methods and objects in the patch?  I am not wedded to the names used in the
patch, but perhaps we should resolve the exception-related issue first?

> This is obviously an exaggerated example but hopefully makes a point.

I'm not sure what the point is.  If the RPC system wished to return some generic data for
all protocols, it could not force a different return type, since return types are
protocol-specific; we'd probably need to add a method to the RPC runtime instead.  If the
situation is unusual, then an exception would be more appropriate.

HTTP is perhaps an analogy.  It uses the return code to indicate a redirect.  URLConnection#getInputStream()
doesn't throw an exception when it encounters a redirect.  It either follows the redirect
or returns the text of the redirect page, depending on whether you've set HttpURLConnection#setFollowRedirects.
Redirects are part of the protocol.  The HTTP return type is complex.  That's the model we
must move towards if we wish to handle links without adding lots more RPCs.
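
A minimal sketch of that model, under stated assumptions: the open call returns a small result
object that carries either the file data or a link target, and the client follows the link
itself, much as an HTTP client follows a redirect.  OpenResult, FakeNameNode and the hop limit
are hypothetical names and values chosen for illustration, not the types used in the patch.
For brevity the sketch treats a link as a whole-path alias; real resolution would also have to
handle links in intermediate path components.

```java
import java.util.HashMap;
import java.util.Map;

public class LinkAwareOpen {
    /** Hypothetical complex return value: exactly one of data or linkTarget is non-null. */
    static final class OpenResult {
        final String data;
        final String linkTarget;

        private OpenResult(String data, String linkTarget) {
            this.data = data;
            this.linkTarget = linkTarget;
        }
        static OpenResult ofData(String data)   { return new OpenResult(data, null); }
        static OpenResult ofLink(String target) { return new OpenResult(null, target); }
        boolean isLink() { return linkTarget != null; }
    }

    /** Hypothetical stand-in for a remote namenode; each open() counts as one RPC. */
    static class FakeNameNode {
        final Map<String, String> links = new HashMap<>();
        int rpcCount = 0;

        OpenResult open(String path) {
            rpcCount++;
            String target = links.get(path);
            return target != null ? OpenResult.ofLink(target)
                                  : OpenResult.ofData("data:" + path);
        }
    }

    /** Client-side loop: a link is an ordinary return value, not an exception. */
    static String open(FakeNameNode nn, String path) {
        final int maxHops = 8; // arbitrary guard against link cycles
        for (int hop = 0; hop <= maxHops; hop++) {
            OpenResult r = nn.open(path);
            if (!r.isLink()) {
                return r.data;
            }
            path = r.linkTarget; // follow the link, like an HTTP redirect
        }
        throw new IllegalStateException("too many levels of links: " + path);
    }

    public static void main(String[] args) {
        FakeNameNode nn = new FakeNameNode();
        open(nn, "/user/data.txt");
        System.out.println("RPCs with no links present: " + nn.rpcCount);

        nn.links.put("/alias", "/user/data.txt");
        nn.rpcCount = 0;
        open(nn, "/alias");
        System.out.println("RPCs with one link: " + nn.rpcCount);
    }
}
```

When no link is present the sketch makes a single call, and each link on the way adds one
more.  The cost of that saving is a return type that is no longer a plain file handle.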

> Create symbolic links in HDFS
> -----------------------------
>
>                 Key: HADOOP-4044
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4044
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: symLink1.patch, symLink1.patch, symLink4.patch, symLink5.patch, symLink6.patch, symLink8.patch, symLink9.patch
>
>
> HDFS should support symbolic links. A symbolic link is a special type of file that contains
> a reference to another file or directory in the form of an absolute or relative path and that
> affects pathname resolution. Programs which read or write to files named by a symbolic link
> will behave as if operating directly on the target file. However, archiving utilities can
> handle symbolic links specially and manipulate them directly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

