hadoop-common-issues mailing list archives

From "Christopher Tubbs (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12118) Validate xml configuration files with XML Schema
Date Mon, 29 Jun 2015 17:53:05 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605993#comment-14605993 ]

Christopher Tubbs commented on HADOOP-12118:

xmllint isn't needed. You can validate in Java:

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
    dbf.setSchema(sf.newSchema(new File("path/to/hadoop-configuration.xsd")));
    Document d = dbf.newDocumentBuilder().parse(new File("path/to/core-site.xml")); // throws exception if can't parse
    ... // additional checks, manual parsing, getting elements, etc. here
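
A fuller, self-contained sketch of the same idea follows. It rests on a few assumptions that are not from the snippet above: the inline XSD is a hypothetical, minimal stand-in for the attached hadoop-configuration.xsd (which is not reproduced here), the class name StrictConfigParser is made up, and the XML is read from strings rather than files. With the JDK's built-in parser, schema violations are typically reported to the ErrorHandler as recoverable errors, so the sketch installs a handler that rethrows them and turns on namespace awareness, which schema validation generally expects.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class StrictConfigParser {

    // Hypothetical minimal schema: <configuration> holds <property> elements,
    // each with exactly one <name> followed by one <value>.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='configuration'><xs:complexType><xs:sequence>"
      + "<xs:element name='property' minOccurs='0' maxOccurs='unbounded'>"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name='name' type='xs:string'/>"
      + "<xs:element name='value' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Parses the XML text and validates it against XSD while parsing.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        dbf.setSchema(sf.newSchema(new StreamSource(new StringReader(XSD))));

        DocumentBuilder db = dbf.newDocumentBuilder();
        // Rethrow validation errors so that parse() fails instead of only logging them.
        db.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
            @Override public void error(SAXParseException e) throws SAXException { throw e; }
            @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });
        return db.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String good = "<configuration><property><name>fs.defaultFS</name>"
                    + "<value>hdfs://localhost:9000</value></property></configuration>";
        // Same content, but with <key> where the schema expects <name>.
        String typo = good.replace("name>", "key>");

        parse(good);
        System.out.println("good: accepted");
        try {
            parse(typo);
            System.out.println("typo: accepted");
        } catch (SAXException e) {
            System.out.println("typo: rejected: " + e.getMessage());
        }
    }
}
```

For real use, the string sources would be swapped back for the File arguments shown in the snippet above.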

> Validate xml configuration files with XML Schema
> ------------------------------------------------
>                 Key: HADOOP-12118
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12118
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Christopher Tubbs
>         Attachments: HADOOP-7947.branch-2.1.patch, hadoop-configuration.xsd
> I spent an embarrassingly long time today trying to figure out why the following wouldn't work:
> {code}
> <property>
>   <key>fs.defaultFS</key>
>   <value>hdfs://localhost:9000</value>
> </property>
> {code}
> I just kept getting an error about no authority for {{fs.defaultFS}}, with a value of
> {{file:///}}, which made no sense... because I knew it was there.
> The problem was that the {{core-site.xml}} was parsed entirely without any validation.
> This seems incorrect. The very least that could be done is a simple XML Schema validation
> against an XSD, before parsing. That way, users will get immediate failures on common typos
> and other problems in the xml configuration files.
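
The "validate before parsing" step described in the issue can also be sketched with the standalone javax.xml.validation.Validator, which by default throws on the first schema violation. The property in the example uses <key> where Hadoop configuration files expect <name>, so a schema check would flag it immediately. The inline XSD below is a hypothetical, minimal stand-in for the attached hadoop-configuration.xsd, and the class and method names (ConfigPrecheck, firstProblem) are invented for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class ConfigPrecheck {

    // Hypothetical minimal schema: each <property> must contain <name> then <value>.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='configuration'><xs:complexType><xs:sequence>"
      + "<xs:element name='property' minOccurs='0' maxOccurs='unbounded'>"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name='name' type='xs:string'/>"
      + "<xs:element name='value' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Returns null if the XML text is valid, otherwise the validator's error message.
    static String firstProblem(String xml) throws SAXException, IOException {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(XSD)));
        try {
            // With no ErrorHandler set, validate() throws on the first error.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // The property from the issue description, wrapped in a <configuration> root.
        String reported = "<configuration><property><key>fs.defaultFS</key>"
                        + "<value>hdfs://localhost:9000</value></property></configuration>";
        String problem = firstProblem(reported);
        System.out.println(problem == null ? "valid" : "invalid: " + problem);
    }
}
```

A check like this could run once at startup, before the configuration is handed to the regular parser.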

This message was sent by Atlassian JIRA
