hadoop-common-issues mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions
Date Mon, 27 Nov 2017 18:55:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16267246#comment-16267246 ]

Chris Douglas commented on HADOOP-14600:
----------------------------------------

bq. replace `listLocatedStatus` call with `listStatusIterator` because it returns FileStatus rather than LocatedFileStatus and that doesn't trigger all the getPermission() mess at all

Good point. Unfortunately, for everything accepted by the filter (which defaults to accepting
everything, IIRC), we double the RPCs if the client subsequently asks for locations. That's
bad for HDFS, but irrelevant to the local FS and object stores that don't report locality
information.

[~myapachejira], have you had a chance to verify the patch yet?

> LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions
> ----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-14600
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14600
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.7.3
>         Environment: file:// in a dir with many files
>            Reporter: Steve Loughran
>            Assignee: Ping Liu
>         Attachments: HADOOP-14600.001.patch, HADOOP-14600.002.patch, HADOOP-14600.003.patch,
HADOOP-14600.004.patch, HADOOP-14600.005.patch, HADOOP-14600.006.patch, HADOOP-14600.007.patch,
HADOOP-14600.008.patch, HADOOP-14600.009.patch, TestRawLocalFileSystemContract.java
>
>
> Reported in SPARK-21137. A {{FileSystem.listStatus}} call really crawls against the local
FS, because the {{FileStatus.getPermission}} call forces {{DeprecatedRawLocalFileStatus}} to
spawn a process to read the real UGI values.
> That is: what is a field lookup or even a no-op on every other FS is a process exec/spawn
on the local FS, with all the costs that entails. This gets expensive if you have many files.
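
For illustration only, here is a minimal sketch of reading permissions in-process with the JDK's java.nio.file API, so that listing a directory costs one attribute lookup per file instead of one child process per file. It is not taken from any of the attached patches and is not how RawLocalFileSystem is implemented; the class name LocalPermissionLookup and its methods are hypothetical, and the sketch assumes a platform whose default file system supports POSIX attributes.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper, not part of Hadoop.
public class LocalPermissionLookup {

    /** Returns true if the default file system exposes POSIX attributes. */
    public static boolean posixSupported() {
        return FileSystems.getDefault()
                .supportedFileAttributeViews()
                .contains("posix");
    }

    /**
     * Reads the permission bits through the JDK attribute API and formats
     * them as a nine-character string such as "rw-r--r--".
     * No external command is launched.
     */
    public static String permissionsInProcess(Path path) throws IOException {
        Set<PosixFilePermission> perms = Files.getPosixFilePermissions(path);
        return PosixFilePermissions.toString(perms);
    }

    public static void main(String[] args) throws IOException {
        if (!posixSupported()) {
            System.out.println("POSIX attributes are not supported here");
            return;
        }
        // Create a small scratch directory to list.
        Path dir = Files.createTempDirectory("perm-demo");
        for (int i = 0; i < 5; i++) {
            Files.createFile(dir.resolve("file-" + i));
        }
        // One in-process lookup per entry, then clean up the entry.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                System.out.println(entry.getFileName() + " "
                        + permissionsInProcess(entry));
                Files.delete(entry);
            }
        }
        Files.delete(dir);
    }
}
```

The sketch only shows that the permission bits are reachable without an exec; owner and group lookups, and non-POSIX platforms, would need separate handling.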



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


