hadoop-common-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10034) optimize same-filesystem symlinks by doing resolution server-side
Date Wed, 09 Oct 2013 19:04:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790727#comment-13790727 ]

Colin Patrick McCabe commented on HADOOP-10034:

bq. The suggestion here is akin to a NFS server resolving symlinks for the client.

With NFS, the client caches metadata for a configurable amount of time.  NFS also uses "file
handles" extensively rather than path names.  HDFS has no such client-side caching, and we
use full paths all over the place.  We have nothing to prevent the client from hammering the
server when symlinks are in use.
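
As an illustration of the time-bounded client-side caching described above, here is a
minimal sketch. Nothing like it exists in the HDFS client today; the class name
{{TtlSymlinkCache}}, the injected lookup function, and the TTL value are all hypothetical
and only show the idea of keeping repeated lookups away from the server.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical sketch: a client-side cache of symlink targets with a configurable TTL.
// Not thread-safe and not part of any Hadoop API.
public class TtlSymlinkCache {
    private static final class Entry {
        final String target;
        final long expiresAtMillis;

        Entry(String target, long expiresAtMillis) {
            this.target = target;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;                    // injected so expiry can be tested
    private final Function<String, String> serverLookup; // stand-in for a round trip to the server
    private int serverCalls = 0;

    public TtlSymlinkCache(long ttlMillis, LongSupplier clock,
                           Function<String, String> serverLookup) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        this.serverLookup = serverLookup;
    }

    // Returns the cached target while the entry is fresh; otherwise asks the server again.
    public String resolve(String path) {
        long now = clock.getAsLong();
        Entry cached = entries.get(path);
        if (cached != null && now < cached.expiresAtMillis) {
            return cached.target;
        }
        serverCalls++;
        String target = serverLookup.apply(path);
        entries.put(path, new Entry(target, now + ttlMillis));
        return target;
    }

    public int serverCalls() {
        return serverCalls;
    }

    public static void main(String[] args) {
        TtlSymlinkCache cache = new TtlSymlinkCache(
                30_000L, System::currentTimeMillis, path -> "/data/target");
        cache.resolve("/link");
        cache.resolve("/link"); // served from the cache, no second server call
        System.out.println("server calls: " + cache.serverCalls());
    }
}
```

Without something along these lines, every use of a path that crosses a symlink costs the
server another resolution.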

bq. For instance, I've considered implementing a filter fs to provide transparent har support.
It would rely on "seeing" the har extension in the path, mask it to look like a standard directory,
and delegating the remainder of the path to har. Symlinks in the path to the har will break
this implementation because the client won't see the har extension and the NN will throw a

If the {{NameNode}}'s {{FileNotFoundException}} (FNF) contained the path it was looking for,
you could simply catch the exception, examine the path, and continue the resolution.
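
A rough sketch of that idea, with no Hadoop dependencies. Everything here is hypothetical:
{{ResolvedFileNotFoundException}} stands in for an FNF that carries the resolved path,
{{serverResolve}} stands in for server-side symlink resolution, and the {{har:}} / {{hdfs:}}
return strings only mark which handler would take over. It is not how the NameNode or har
actually behave today.

```java
import java.io.FileNotFoundException;
import java.util.Map;
import java.util.Set;

public class ResolvedPathSketch {

    // Hypothetical: an FNF that also reports the path the server was looking for
    // after it had resolved any symlinks.
    public static class ResolvedFileNotFoundException extends FileNotFoundException {
        public final String resolvedPath;

        public ResolvedFileNotFoundException(String resolvedPath) {
            super("File does not exist: " + resolvedPath);
            this.resolvedPath = resolvedPath;
        }
    }

    // Stand-in for the server: rewrites symlink prefixes, then checks that the file exists.
    public static String serverResolve(Map<String, String> symlinks, Set<String> files,
                                       String path) throws ResolvedFileNotFoundException {
        String current = path;
        for (int hops = 0; hops < 32; hops++) { // bound the number of link hops
            String next = current;
            for (Map.Entry<String, String> link : symlinks.entrySet()) {
                String from = link.getKey();
                if (current.equals(from) || current.startsWith(from + "/")) {
                    next = link.getValue() + current.substring(from.length());
                    break;
                }
            }
            if (next.equals(current)) {
                break; // no symlink applied, resolution is complete
            }
            current = next;
        }
        if (!files.contains(current)) {
            throw new ResolvedFileNotFoundException(current);
        }
        return current;
    }

    // Client side: catch the exception, examine the path, and continue the resolution.
    public static String clientOpen(Map<String, String> symlinks, Set<String> files,
                                    String path) throws FileNotFoundException {
        try {
            return "hdfs:" + serverResolve(symlinks, files, path);
        } catch (ResolvedFileNotFoundException e) {
            int cut = e.resolvedPath.indexOf(".har/");
            if (cut < 0) {
                throw e; // a genuine miss, nothing for the client to continue with
            }
            String archive = e.resolvedPath.substring(0, cut + ".har".length());
            String inner = e.resolvedPath.substring(cut + ".har/".length());
            return "har:" + archive + "!/" + inner; // delegate the remainder to har
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> symlinks = Map.of("/current", "/data/logs.har");
        Set<String> files = Set.of("/data/logs.har", "/data/plain.txt");
        // prints "har:/data/logs.har!/part-0"
        System.out.println(clientOpen(symlinks, files, "/current/part-0"));
        // prints "hdfs:/data/plain.txt"
        System.out.println(clientOpen(symlinks, files, "/data/plain.txt"));
    }
}
```

The client never sees the har extension in the path it asked for, but it does see it in
the path reported by the exception, which is enough to hand the remainder to har.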

> optimize same-filesystem symlinks by doing resolution server-side
> -----------------------------------------------------------------
>                 Key: HADOOP-10034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10034
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs
>            Reporter: Colin Patrick McCabe
> We should optimize same-filesystem symlinks by doing resolution server-side rather than
> client-side, as discussed on HADOOP-9780.

This message was sent by Atlassian JIRA
