hadoop-yarn-issues mailing list archives

From "Eric Sirianni (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1741) XInclude support broken for YARN ResourceManager
Date Mon, 24 Feb 2014 19:27:21 GMT

    [ https://issues.apache.org/jira/browse/YARN-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910703#comment-13910703 ]

Eric Sirianni commented on YARN-1741:

Yes, this was the approach I was planning to investigate with a potential patch.  The trick
is how to get it working most cleanly with the {{ConfigurationProvider}} API.  Two main
approaches seem possible:
# Change {{ConfigurationProvider.getConfigurationInputStream()}} to return a {{(String, InputStream)}} pair.
# Change {{ConfigurationProvider}} to populate the {{Configuration}} object directly, with something like {{ConfigurationProvider.provideTo(Configuration conf)}}.  With this approach, each {{ConfigurationProvider}} subclass could invoke whichever {{conf.addResource()}} overload makes sense for that subclass.

Based on investigating the usages of {{ConfigurationProvider.getConfigurationInputStream()}},
I was leaning towards the second approach.
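
A rough sketch of what the second approach might look like follows. Everything in it is hypothetical: {{ResourceSink}} is a stand-in for the two {{Configuration.addResource()}} overloads of interest, {{java.nio.file.Path}} stands in for Hadoop's {{Path}}, and the {{provideTo()}} signature and both provider classes are invented for illustration rather than taken from any patch.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class ProvideToSketch {

    /** Hypothetical stand-in for the Configuration.addResource() overloads used here. */
    interface ResourceSink {
        void addResource(Path file);       // path-based: the parser can be given a systemId
        void addResource(InputStream in);  // raw stream: no systemId is available
    }

    /** Hypothetical shape of approach 2; this is not the existing YARN API. */
    interface ConfigurationProvider {
        void provideTo(ResourceSink conf, String name);
    }

    /** A file-backed provider keeps the path, so relative XIncludes stay resolvable. */
    static final class LocalFileProvider implements ConfigurationProvider {
        private final Path confDir;

        LocalFileProvider(Path confDir) {
            this.confDir = confDir;
        }

        @Override
        public void provideTo(ResourceSink conf, String name) {
            conf.addResource(confDir.resolve(name));
        }
    }

    /** A provider with no file behind it falls back to the stream overload. */
    static final class InMemoryProvider implements ConfigurationProvider {
        private final byte[] xml;

        InMemoryProvider(byte[] xml) {
            this.xml = xml;
        }

        @Override
        public void provideTo(ResourceSink conf, String name) {
            conf.addResource(new ByteArrayInputStream(xml));
        }
    }

    /** Records which overload each provider chose. */
    static final class RecordingSink implements ResourceSink {
        final List<String> calls = new ArrayList<>();

        @Override
        public void addResource(Path file) {
            calls.add("path:" + file);
        }

        @Override
        public void addResource(InputStream in) {
            calls.add("stream");
        }
    }

    static List<String> demo() {
        RecordingSink sink = new RecordingSink();
        new LocalFileProvider(Paths.get("conf")).provideTo(sink, "yarn-site.xml");
        byte[] xml = "<configuration/>".getBytes(StandardCharsets.UTF_8);
        new InMemoryProvider(xml).provideTo(sink, "yarn-site.xml");
        return sink.calls;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the design is that the choice of overload moves into the provider, which is the only place that knows whether a real file sits behind the resource.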

> XInclude support broken for YARN ResourceManager
> ------------------------------------------------
>                 Key: YARN-1741
>                 URL: https://issues.apache.org/jira/browse/YARN-1741
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.4.0
>            Reporter: Eric Sirianni
>            Priority: Minor
>              Labels: regression
> The XInclude support in Hadoop configuration files (introduced via HADOOP-4944) was broken by the recent {{ConfigurationProvider}} changes to the YARN ResourceManager, specifically YARN-1459 and, more generally, the YARN-1611 family of JIRAs for ResourceManager HA.
> The issue is that {{ConfigurationProvider}} provides a raw {{InputStream}} as a {{Configuration}} resource for what was previously a {{Path}}-based resource.
> For {{Path}} resources, the absolute file path is used as the {{systemId}} for {{DocumentBuilder.parse()}}:
> {code}
>       } else if (resource instanceof Path) {          // a file resource
> ...
>           doc = parse(builder, new BufferedInputStream(
>               new FileInputStream(file)), ((Path)resource).toString());
>         }
> {code}
> The {{systemId}} is used to resolve XIncludes (among other things):
> {code}
>     /**
>      * Parse the content of the given <code>InputStream</code> as an
>      * XML document and return a new DOM Document object.
> ...
>      * @param systemId Provide a base for resolving relative URIs.
> ...
>      */
>     public Document parse(InputStream is, String systemId)
> {code}
> However, for loading raw {{InputStream}} resources, the {{systemId}} is set to {{null}}:
> {code}
>       } else if (resource instanceof InputStream) {
>         doc = parse(builder, (InputStream) resource, null);
> {code}
> causing XInclude resolution to fail.
> In our particular environment, we make extensive use of XIncludes to standardize common configuration parameters across multiple Hadoop clusters.
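
The {{systemId}} behavior described in the issue can be reproduced with the JDK parser alone, without Hadoop on the classpath. The sketch below is an illustration under assumptions, not the {{Configuration}} code itself: the class name, file names and property names are made up, and the factory settings (namespace-aware, XInclude-aware) are what I understand {{Configuration}} to use. It parses the same bytes twice, once with a {{systemId}} pointing into the directory that holds the included file and once without.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

final class XIncludeSystemIdDemo {

    /** Counts property elements after XInclude processing; -1 means parsing failed. */
    static int countProperties(byte[] xml, String systemId) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);  // XInclude processing needs namespaces
            factory.setXIncludeAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            ByteArrayInputStream in = new ByteArrayInputStream(xml);
            // Without a systemId the parser has no document location to
            // resolve relative hrefs against.
            Document doc = (systemId == null)
                ? builder.parse(in)
                : builder.parse(in, systemId);
            return doc.getElementsByTagName("property").getLength();
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return -1;
        }
    }

    /** Returns {count with systemId, count without systemId}. */
    static int[] demo() {
        try {
            Path dir = Files.createTempDirectory("xinclude-demo");
            // The included file exists only inside the temporary directory.
            String common = "<configuration><property><name>shared.key</name>"
                + "<value>1</value></property></configuration>";
            Files.write(dir.resolve("xinclude-demo-common.xml"),
                common.getBytes(StandardCharsets.UTF_8));

            String main = "<configuration xmlns:xi=\"http://www.w3.org/2001/XInclude\">"
                + "<xi:include href=\"xinclude-demo-common.xml\"/>"
                + "<property><name>local.key</name><value>2</value></property>"
                + "</configuration>";
            byte[] mainBytes = main.getBytes(StandardCharsets.UTF_8);

            // The systemId only supplies a base URI; the main document is read from memory.
            String systemId = dir.resolve("yarn-site.xml").toUri().toString();
            return new int[] {
                countProperties(mainBytes, systemId),
                countProperties(mainBytes, null)
            };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println("with systemId: " + counts[0] + ", without: " + counts[1]);
    }
}
```

With the {{systemId}} supplied, both properties should be visible. Without it, I would expect the include to go unresolved, so the second call should either fail or miss the shared property, which matches the symptom reported for raw {{InputStream}} resources.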
