hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-14132) Filesystem discovery to stop loading implementation classes
Date Fri, 03 Mar 2017 12:22:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15894239#comment-15894239 ]

Steve Loughran commented on HADOOP-14132:
-----------------------------------------

I'd avoid HDFS, as that may trigger creation of {{HdfsConfiguration()}} objects and so the
loading of the relevant defaults.

For wasb & swift, yes, and for those in common which pull in another JAR (ftp, etc.).

> Filesystem discovery to stop loading implementation classes
> -----------------------------------------------------------
>
>                 Key: HADOOP-14132
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14132
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs, fs/adl, fs/azure, fs/oss, fs/s3, fs/swift
>    Affects Versions: 2.7.3
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>
> Integration testing of Hadoop with HADOOP-14040 has shown that the move to a shaded AWS JAR is slowing down all Hadoop client code.
> I believe this is due to how we use service discovery to identify FS implementations: the implementation classes themselves are instantiated.
> This has known problems today with classloading, but it clearly impacts performance too, especially with complex transitive dependencies unique to the loaded class.
> Proposed: have lightweight service declaration classes which implement an interface declaring
> # scheme
> # classname of FileSystem impl
> # classname of AbstractFS impl
> # homepage (for third party code, support, etc)
> These are what we register and scan in the FS to look for services.
> This leaves the question of what to do for existing filesystems. I think we'll need to retain the old code for external ones, while moving the Hadoop modules to the new ones.
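
The proposal quoted above could be sketched roughly as follows. This is only an illustration, not the patch attached to the issue: the names {{FileSystemDeclaration}}, {{SimpleDeclaration}} and {{DeclarationRegistry}} are hypothetical, and the class names registered in the demo are JDK stand-ins so that the sketch runs without Hadoop on the classpath. The point it tries to show is that a declaration carries only strings, so scanning declarations would not load any implementation class or its transitive dependencies; the implementation is resolved by name only when a scheme is actually used.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical lightweight declaration: strings only, no reference to implementation types. */
interface FileSystemDeclaration {
    String getScheme();                       // URI scheme, e.g. "s3a"
    String getFileSystemClassName();          // classname of the FileSystem impl
    String getAbstractFileSystemClassName();  // classname of the AbstractFS impl
    String getHomepage();                     // for third party code, support, etc.
}

/** Plain value holder implementing the hypothetical declaration interface. */
final class SimpleDeclaration implements FileSystemDeclaration {
    private final String scheme;
    private final String fsClassName;
    private final String abstractFsClassName;
    private final String homepage;

    SimpleDeclaration(String scheme, String fsClassName,
                      String abstractFsClassName, String homepage) {
        this.scheme = scheme;
        this.fsClassName = fsClassName;
        this.abstractFsClassName = abstractFsClassName;
        this.homepage = homepage;
    }

    @Override public String getScheme() { return scheme; }
    @Override public String getFileSystemClassName() { return fsClassName; }
    @Override public String getAbstractFileSystemClassName() { return abstractFsClassName; }
    @Override public String getHomepage() { return homepage; }
}

/** Hypothetical registry keyed by scheme; implementation classes load only on demand. */
final class DeclarationRegistry {
    private final Map<String, FileSystemDeclaration> byScheme = new ConcurrentHashMap<>();

    /** Registering touches only the declaration, never the implementation class. */
    void register(FileSystemDeclaration declaration) {
        byScheme.put(declaration.getScheme(), declaration);
    }

    Optional<FileSystemDeclaration> lookup(String scheme) {
        return Optional.ofNullable(byScheme.get(scheme));
    }

    /** Resolves the implementation class by name, the first time it is really needed. */
    Class<?> loadFileSystemClass(String scheme) {
        FileSystemDeclaration declaration = byScheme.get(scheme);
        if (declaration == null) {
            throw new IllegalArgumentException("No filesystem declared for scheme: " + scheme);
        }
        try {
            return Class.forName(declaration.getFileSystemClassName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(
                "Declared class not on classpath: " + declaration.getFileSystemClassName(), e);
        }
    }
}

/** Small demo; java.util.ArrayList/LinkedList stand in for real filesystem classes. */
final class DeclarationDemo {
    public static void main(String[] args) {
        DeclarationRegistry registry = new DeclarationRegistry();
        registry.register(new SimpleDeclaration(
            "demo", "java.util.ArrayList", "java.util.LinkedList", "https://example.org/"));
        System.out.println(registry.loadFileSystemClass("demo").getName());
    }
}
```

In a real deployment the declarations would presumably be discovered with {{java.util.ServiceLoader}} rather than registered by hand, which is cheap here because instantiating a declaration does not pull in the filesystem client or its shaded dependencies.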



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

