hadoop-common-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-9774) RawLocalFileSystem.listStatus() return absolute paths when input path is relative on Windows
Date Sun, 25 Aug 2013 05:14:53 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-9774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13749533#comment-13749533 ]

Chris Nauroth commented on HADOOP-9774:
---------------------------------------

I've completed a full test run for all sub-projects with this patch on Windows, and there
was no negative impact.  I think we just need some additional confirmation on Linux at this
point.

[~ivanmi], would you mind taking care of the commit after all testing has been addressed?
I'm going to be mostly offline until 9/2.
                
> RawLocalFileSystem.listStatus() return absolute paths when input path is relative on Windows
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-9774
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9774
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 3.0.0, 2.1.0-beta
>            Reporter: shanyu zhao
>            Assignee: shanyu zhao
>         Attachments: HADOOP-9774-2.patch, HADOOP-9774-3.patch, HADOOP-9774-4.patch, HADOOP-9774-5.patch, HADOOP-9774.patch
>
>
> On Windows, when RawLocalFileSystem.listStatus() is used to enumerate a relative path (one without a drive spec), e.g., "file:///mydata", the resulting paths are absolute, e.g., ["file://E:/mydata/t1.txt", "file://E:/mydata/t2.txt"...].
> Note that enumerating an absolute path, e.g., "file://E:/mydata", returns the same results as above.
> This breaks some Hive unit tests, which use the local file system to simulate HDFS and therefore remove the drive spec from their paths. After listStatus() changes the paths to absolute ones, Hive fails to find them in its MapReduce job.
> You'll see the following exception:
> [junit] java.io.IOException: cannot find dir = pfile:/E:/GitHub/hive-monarch/build/ql/test/data/warehouse/src/kv1.txt in pathToPartitionInfo: [pfile:/GitHub/hive-monarch/build/ql/test/data/warehouse/src]
> [junit] 	at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:298)
> This problem was introduced by this JIRA:
> HADOOP-8962
> Prior to the fix for HADOOP-8962 (merged in 0.23.5), the resulting paths were relative if the parent path was relative, e.g., ["file:///mydata/t1.txt", "file:///mydata/t2.txt"...].
> This behavior change is a side effect of the fix in HADOOP-8962, not an intended change. The resulting behavior, even though it is legitimate from a functional point of view, breaks consistency from the caller's point of view. When the caller uses a relative path (without a drive spec) to call listStatus(), the resulting paths should be relative. Therefore, I think this should be fixed.
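
To illustrate the expectation described in the issue, here is a minimal sketch in plain Java. It does not use the Hadoop API and is not the code from any of the attached patches. The class ListStatusSketch and the helpers resolveAgainstDrive, listFromResolvedParent, and listFromCallerParent are hypothetical names introduced only for this example, and the path strings simply mimic the notation used in the description. The sketch contrasts building child paths from a drive-resolved parent (the reported behavior) with building them from the parent exactly as the caller passed it (the expected behavior).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: plain string handling, no Hadoop classes involved.
class ListStatusSketch {
    private static final String SCHEME = "file://";

    // Hypothetical stand-in for resolving a drive-less path against a drive,
    // e.g. "file:///mydata" with drive "E:" becomes "file://E:/mydata".
    static String resolveAgainstDrive(String path, String drive) {
        if (!path.startsWith(SCHEME)) {
            throw new IllegalArgumentException("expected a file:// path: " + path);
        }
        String rest = path.substring(SCHEME.length());
        // A leading "/" means no drive spec is present, so prepend one.
        return rest.startsWith("/") ? SCHEME + drive + rest : path;
    }

    // Reported behavior: children are derived from the resolved (absolute) parent.
    static List<String> listFromResolvedParent(String parent, String drive, List<String> names) {
        return join(resolveAgainstDrive(parent, drive), names);
    }

    // Expected behavior: children keep the form of the parent the caller supplied.
    static List<String> listFromCallerParent(String parent, List<String> names) {
        return join(parent, names);
    }

    // Appends each entry name to the parent, adding a separator if needed.
    private static List<String> join(String parent, List<String> names) {
        String base = parent.endsWith("/") ? parent : parent + "/";
        List<String> children = new ArrayList<>();
        for (String name : names) {
            children.add(base + name);
        }
        return children;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("t1.txt", "t2.txt");
        System.out.println(listFromResolvedParent("file:///mydata", "E:", names));
        System.out.println(listFromCallerParent("file:///mydata", names));
    }
}
```

With a drive-less parent, the first call yields paths that carry the drive spec, while the second keeps the children in the same form as the input, which is what a caller comparing against drive-less paths (as in the Hive tests mentioned above) would rely on.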

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
