hadoop-common-dev mailing list archives

From "Tsz Wo (Nicholas), SZE (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2366) Space in the value for dfs.data.dir can cause great problems
Date Tue, 16 Jun 2009 20:35:07 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720335#action_12720335 ]

Tsz Wo (Nicholas), SZE commented on HADOOP-2366:
------------------------------------------------

> If tests pass after fixing the behavior on an empty conf, do you have an issue with changing
> the semantics of these utility functions so long as the new behavior is clearly documented
> in the javadoc?

Configuration is a public class and is not a part of fs or hdfs.  Trimming the string values
may make sense for fs/hdfs paths but it may not for other usages.  Personally, I wish the
trimming had been done from the beginning.  Unfortunately, it was not.  If we change it now,
it breaks existing semantics.  I think that users rarely use leading or trailing spaces in
configuration values but we cannot break them.

When I worked on HADOOP-2461, my view was that the property names should be trimmed but not
the values.  Otherwise, it forbids any potential use of leading and trailing spaces.  If there
is a need, the code using the conf values should do the trimming.  In this issue, only the
values for dfs.data.dir should be trimmed.

If a trimmed version of getStrings(..) is needed, I think it is better to provide new methods,
say getTrimmedStrings(..), in Configuration and StringUtils rather than changing the existing ones.
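
To illustrate the suggestion, here is a minimal, self-contained sketch of what a pair of
such methods might look like side by side.  It is not the actual Configuration or StringUtils
code: the class name TrimmedStringsSketch and both method bodies are assumptions made for this
example, and the real getStrings(..) may differ in how it tokenizes and handles empty values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only; not the real Hadoop Configuration/StringUtils code.
class TrimmedStringsSketch {

    // Stand-in for the existing behavior: split on commas and leave
    // any whitespace around each element untouched.
    static String[] getStrings(String value) {
        if (value == null) {
            return null;
        }
        return value.split(",");
    }

    // Stand-in for the proposed new method: same split, but each element
    // is trimmed.  The existing method above is left unchanged.
    static String[] getTrimmedStrings(String value) {
        if (value == null) {
            return null;
        }
        List<String> out = new ArrayList<String>();
        for (String token : value.split(",")) {
            out.add(token.trim());
        }
        return out.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String raw = "/mnt/hstore2/hdfs, /home/foo/dfs";
        // The second element keeps its leading space with the existing-style method.
        System.out.println(Arrays.toString(getStrings(raw)));
        // The new-style method removes it.
        System.out.println(Arrays.toString(getTrimmedStrings(raw)));
    }
}
```

With this shape, callers that depend on leading or trailing spaces keep calling the old
method, and only code that wants trimming (such as the dfs.data.dir handling) opts in.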

> Space in the value for dfs.data.dir can cause great problems
> ------------------------------------------------------------
>
>                 Key: HADOOP-2366
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2366
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: conf
>            Reporter: Ted Dunning
>            Assignee: Todd Lipcon
>         Attachments: HADOOP-2366.patch
>
>
> The following configuration causes problems:
> <property>
>   <name>dfs.data.dir</name>
>   <value>/mnt/hstore2/hdfs, /home/foo/dfs</value>  
>   <description>
>   Determines where on the local filesystem an DFS data node should store its
>   blocks.  If this is a comma-delimited list of directories, then data will be
>   stored in all named directories, typically on different devices.  Directories
>   that do not exist are ignored.
>   </description>
> </property>
> The problem is that the space after the comma causes the second storage directory to be
> " /home/foo/dfs", which is a directory named <SPACE> that contains a sub-dir named "home",
> located in the hadoop datanode's default directory.  This will typically cause the user's
> home partition to fill, and the cause will be very hard for the user to track down, since a
> directory with a whitespace name is hard to spot.
> My proposed solution would be to trimLeft all path names from this and similar properties
> after splitting on comma.  This still allows spaces in file and directory names but avoids
> this problem.
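
The failure mode described in the issue can be reproduced outside Hadoop.  The sketch below
assumes nothing beyond java.io.File; the class name LeadingSpacePathSketch and the helpers
trimLeft and splitAndTrimLeft are hypothetical names for this example, not Hadoop APIs.  A path
that starts with a space is not absolute, so the JVM resolves it against the working directory.

```java
import java.io.File;

// Hypothetical demonstration; none of these names come from Hadoop.
class LeadingSpacePathSketch {

    // Remove leading whitespace only, so interior and trailing spaces
    // in file and directory names are preserved.
    static String trimLeft(String s) {
        return s.replaceFirst("^\\s+", "");
    }

    // Split a comma-delimited list and left-trim every entry.
    static String[] splitAndTrimLeft(String value) {
        String[] parts = value.split(",");
        for (int i = 0; i < parts.length; i++) {
            parts[i] = trimLeft(parts[i]);
        }
        return parts;
    }

    public static void main(String[] args) {
        String value = "/mnt/hstore2/hdfs, /home/foo/dfs";

        // Without trimming: the second entry starts with a space, so it is
        // treated as a relative path under the current working directory.
        for (String raw : value.split(",")) {
            File f = new File(raw);
            System.out.println("[" + raw + "] absolute=" + f.isAbsolute()
                    + " resolved=" + f.getAbsolutePath());
        }

        // With left-trimming after the split: the entry is the intended path.
        for (String fixed : splitAndTrimLeft(value)) {
            File f = new File(fixed);
            System.out.println("[" + fixed + "] absolute=" + f.isAbsolute());
        }
    }
}
```

Because only the left side is trimmed, a directory such as "my dir" inside a path is left
alone, which matches the intent stated in the proposal.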

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

