hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-14138) Remove S3A ref from META-INF service discovery, rely on existing core-default entry
Date Sat, 22 Apr 2017 12:15:04 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979880#comment-15979880 ]

Steve Loughran commented on HADOOP-14138:

bq. A lot of people consider the core-default files as documentation

They are, but since you need to create a Configuration(true) for the basics of
talking to anything in a Hadoop cluster, they are effectively a declarative configuration.

bq. Hive uses a nice approach where HiveConf.get(ParamName) implicitly picks up default values.
No *-default.xml file here either.

The big issue in Hadoop core is that there is no central config point: Configuration, HdfsConfiguration,
YarnConfiguration, JTConf, plus lots of other bits through the code. For bonus fun, different
projects have different world views on the visibility of even field names, with the HDFS project
considering even {{HdfsClientConfigKeys}} to be private data.
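
For illustration, the "defaults live in code" approach quoted above can be sketched in plain Java. This is a minimal sketch, not HiveConf's or Hadoop's actual API: {{ConfVar}}, {{SimpleConf}} and {{ConfDemo}} are hypothetical names, and the two keys are only sample entries.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical central list of keys; each constant carries its own default. */
enum ConfVar {
    FS_S3A_IMPL("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem"),
    IO_BUFFER_SIZE("io.file.buffer.size", "4096");

    final String key;
    final String defaultValue;

    ConfVar(String key, String defaultValue) {
        this.key = key;
        this.defaultValue = defaultValue;
    }
}

/** Hypothetical config holder: explicit settings override the coded defaults. */
class SimpleConf {
    private final Map<String, String> overrides = new HashMap<>();

    void set(ConfVar var, String value) {
        overrides.put(var.key, value);
    }

    String get(ConfVar var) {
        // No *-default.xml lookup: the default travels with the key itself.
        return overrides.getOrDefault(var.key, var.defaultValue);
    }
}

class ConfDemo {
    public static void main(String[] args) {
        SimpleConf conf = new SimpleConf();
        System.out.println(conf.get(ConfVar.IO_BUFFER_SIZE)); // prints 4096
        conf.set(ConfVar.IO_BUFFER_SIZE, "65536");
        System.out.println(conf.get(ConfVar.IO_BUFFER_SIZE)); // prints 65536
    }
}
```

The trade-off is the one discussed here: this only works if every key is declared in one agreed, publicly visible place.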

W.r.t. opening this up: if someone actually commits to allocating time to do this, and
others to reviewing it, then it's worth doing. Otherwise it'll just be another abandoned
wish-list item. While I find the current situation annoying, I have too many half-complete
spare-time work items to get in to worry about this. Just load Configuration with
loadDefaults=true and don't worry about it.

> Remove S3A ref from META-INF service discovery, rely on existing core-default entry
> -----------------------------------------------------------------------------------
>                 Key: HADOOP-14138
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14138
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.9.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Critical
>             Fix For: 2.8.0, 2.7.4, 3.0.0-alpha3
>         Attachments: HADOOP-14138.001.patch, HADOOP-14138-branch-2-001.patch
> As discussed in HADOOP-14132, the shaded AWS library is killing the startup performance
> of all hadoop operations, due to classloading on FS service discovery.
> This is despite the fact that there is an entry for fs.s3a.impl in core-default.xml:
> *we don't need service discovery here*
> Proposed:
> # cut the entry from {{/hadoop-aws/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem}}
> # when HADOOP-14132 is in, move to that, including declaring an XML file exclusively
> for s3a entries
> I want this one in first as it's a major performance regression, and one we could actually
> backport to 2.7.x, just to improve load time slightly there too
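
To make the proposal concrete, here is a rough model of why a config entry is cheaper than service discovery. It is a simplified sketch, not Hadoop's actual lookup code: {{LazyFsRegistry}} and the {{demo}} scheme are hypothetical, and {{java.util.ArrayList}} stands in for a real filesystem class so the example runs without Hadoop on the classpath. The idea is that a class named under {{fs.<scheme>.impl}} is only loaded when that scheme is requested, whereas {{java.util.ServiceLoader}} has to load and instantiate every provider listed under META-INF/services as it iterates.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical registry: resolves "fs.<scheme>.impl" entries on demand. */
class LazyFsRegistry {
    private final Properties conf;
    private final Map<String, Class<?>> loaded = new HashMap<>();

    LazyFsRegistry(Properties conf) {
        this.conf = conf;
    }

    /** Load the implementation class for a scheme on first use only. */
    Class<?> classFor(String scheme) {
        Class<?> cached = loaded.get(scheme);
        if (cached != null) {
            return cached;
        }
        String className = conf.getProperty("fs." + scheme + ".impl");
        if (className == null) {
            throw new IllegalArgumentException("No fs." + scheme + ".impl entry");
        }
        try {
            Class<?> clazz = Class.forName(className);
            loaded.put(scheme, clazz);
            return clazz;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Cannot load " + className, e);
        }
    }

    /** Number of implementation classes actually loaded so far. */
    int loadedCount() {
        return loaded.size();
    }
}

class LazyFsDemo {
    public static void main(String[] args) {
        Properties conf = new Properties();
        // Stand-in class so the sketch runs anywhere.
        conf.setProperty("fs.demo.impl", "java.util.ArrayList");
        // Declared but never requested, so never loaded.
        conf.setProperty("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem");

        LazyFsRegistry registry = new LazyFsRegistry(conf);
        System.out.println(registry.loadedCount());               // prints 0
        System.out.println(registry.classFor("demo").getName());  // prints java.util.ArrayList
        System.out.println(registry.loadedCount());               // prints 1
    }
}
```

In this model the s3a entry costs nothing until something asks for an s3a:// path, which is the behaviour the proposal relies on.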
