hadoop-common-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-9150) Unnecessary DNS resolution attempts for logical URIs
Date Thu, 24 Jan 2013 18:41:14 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-9150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561840#comment-13561840 ]

Suresh Srinivas commented on HADOOP-9150:
-----------------------------------------

Sorry for the late reply. In my previous comment I was only asking a question to understand
the change, not suggesting a change.

bq. The reason that we need to create canonicalizeUri and allow implementations to override
it is that we have to canonicalize the URI parameter in checkPath. Since we don't have a FileSystem
instance corresponding to the URI parameter, we have to add this method which takes a URI
for this to work out.
Now I understand the issue better.

Some observations/comments:
# With this patch, the default FileSystem#getCanonicalUri() implementation changes. The host is
no longer canonicalized for file systems that do not override the method. This is a change
in behavior. However, it restores the behavior from before the HADOOP-7510 changes, so that
should be okay.
# Making getCanonicalUri() final is an incompatible API change for file system implementations.
While in my previous comment I asked you about making getCanonicalUri() final, after thinking a
bit more, it is not worth making an incompatible change. There are now two ways in FileSystem
to override the behavior: getCanonicalUri() and canonicalizeUri(). It is not clean, but it may
be worth doing to keep API compatibility. Also, adding a comment to getCanonicalUri() saying
that most file systems need to override only canonicalizeUri(), and not this method, should
add more clarity.
# It is worth making this change in branch-1, since some new file system implementations
could start from that branch.
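
To illustrate the two override points discussed above, here is a minimal sketch. It is not the patch itself: SketchFileSystem, HostPortFileSystem, LogicalNameFileSystem, sameFileSystem() and the port 8020 are all hypothetical names and values made up for this example. The assumption is that getCanonicalUri() simply delegates to canonicalizeUri(URI), so a checkPath-style comparison can canonicalize a URI for which no FileSystem instance exists, and a file system with a logical name can skip host handling by overriding only canonicalizeUri().

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Objects;

// Hypothetical stand-in for a file system base class; NOT Hadoop's actual code.
abstract class SketchFileSystem {
    abstract URI getUri();
    abstract int getDefaultPort();

    // Preferred override point: canonicalizes any URI, not only this instance's own.
    // Assumed default: fill in the default port, with no DNS lookup of the host.
    URI canonicalizeUri(URI uri) {
        if (uri.getPort() == -1 && getDefaultPort() > 0) {
            try {
                return new URI(uri.getScheme(), uri.getUserInfo(), uri.getHost(),
                        getDefaultPort(), uri.getPath(), uri.getQuery(), uri.getFragment());
            } catch (URISyntaxException e) {
                throw new IllegalArgumentException(e);
            }
        }
        return uri;
    }

    // Left non-final for API compatibility. Most implementations would
    // override canonicalizeUri() rather than this method.
    URI getCanonicalUri() {
        return canonicalizeUri(getUri());
    }

    // checkPath-style comparison: both sides go through canonicalizeUri().
    boolean sameFileSystem(URI other) {
        return Objects.equals(getCanonicalUri().getAuthority(),
                canonicalizeUri(other).getAuthority());
    }
}

// File system addressed by a real host name and port.
class HostPortFileSystem extends SketchFileSystem {
    private final URI uri;
    HostPortFileSystem(URI uri) { this.uri = uri; }
    URI getUri() { return uri; }
    int getDefaultPort() { return 8020; } // hypothetical default port
}

// File system addressed by a logical name that is not a resolvable host.
class LogicalNameFileSystem extends SketchFileSystem {
    private final URI uri;
    LogicalNameFileSystem(URI uri) { this.uri = uri; }
    URI getUri() { return uri; }
    int getDefaultPort() { return 8020; } // hypothetical default port

    @Override
    URI canonicalizeUri(URI uri) {
        return uri; // logical name: leave untouched, no port and no DNS
    }
}
```

With this shape, a LogicalNameFileSystem created for hdfs://mycluster keeps that URI as its canonical form, while a HostPortFileSystem created for hdfs://nn1.example.com/ canonicalizes to hdfs://nn1.example.com:8020/.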



                
> Unnecessary DNS resolution attempts for logical URIs
> ----------------------------------------------------
>
>                 Key: HADOOP-9150
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9150
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3, ha, performance, viewfs
>    Affects Versions: 3.0.0, 2.0.2-alpha
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>         Attachments: hadoop-9150.txt, hadoop-9150.txt, hadoop-9150.txt, hadoop-9150.txt,
> hadoop-9150.txt, hadoop-9150.txt, hadoop-9150.txt, hadoop-9150.txt, hadoop-9150.txt, log.txt,
> tracing-resolver.tgz
>
>
> In the FileSystem code, we accidentally try to DNS-resolve the logical name before it
> is converted to an actual domain name. In some DNS setups, this can cause a big slowdown.
> For example, in one misconfigured cluster we saw a 2-3x drop in terasort throughput, since every task
> wasted a lot of time waiting for slow "not found" responses from DNS.

