xml-xalan-j-users mailing list archives

From Richard Emberson <rember...@outerharbor.com>
Subject xpath and namespace
Date Thu, 29 Aug 2002 15:46:48 GMT
Hi,

Here is the situation.
There is a document type, specified by an XML Schema with the
namespace "http://bigco.com".
Here are a few instance documents:

document (A):
<?xml version="1.0"?>
<TOP>
    <ONE>C6217</ONE>
    <TWO>PO1001</TWO>
</TOP>

document (B):
<TOP xmlns="" >
    <ONE>C6217</ONE>
    <TWO>PO1001</TWO>
</TOP>

document (C):
<?xml version="1.0"?>
<TOP xmlns="http://bigco.com">
    <ONE>C6217</ONE>
    <TWO>PO1001</TWO>
</TOP>

document (D):
<?xml version="1.0"?>
<foo:TOP xmlns:foo="http://bigco.com">
    <foo:ONE>C6217</foo:ONE>
    <foo:TWO>PO1001</foo:TWO>
</foo:TOP>

(Yes, documents A and B are equivalent, and because neither declares
the namespace "http://bigco.com" they are not "really" valid.)


There are three participating entities in the system:
The first, player 1, is the user who defines an xpath into the
document.
This xpath does not have prefixes!!!
An example xpath is:

xpath = "/TOP/ONE/text()"

The second, player 2, is a user who generates an instance document:
either A, B, C, or D.

The last entity, the central system, takes player 1's xpath (with no
prefixes) and applies it to the document generated by player 2.
The result of applying the xpath to the document is processed
further, but that is not of interest here.
What is of concern is how to build the central system so that it
can accommodate all 4 "types" of input documents.

The first two document "types", A and B, work just fine using:

        // XPathAPI is org.apache.xpath.XPathAPI; its methods are static.
        return XPathAPI.selectSingleNode(doc, xpath);
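For reference, here is a minimal sketch of how the unprefixed xpath behaves against all four documents. It uses the JDK's javax.xml.xpath API with a namespace-aware parser instead of Xalan's XPathAPI, which is an assumption made to keep the example self-contained. The class, method, and constant names are illustrative only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class UnprefixedXPathDemo {
    // The four instance documents from the message, with whitespace removed.
    static final String DOC_A =
        "<?xml version='1.0'?><TOP><ONE>C6217</ONE><TWO>PO1001</TWO></TOP>";
    static final String DOC_B =
        "<TOP xmlns=''><ONE>C6217</ONE><TWO>PO1001</TWO></TOP>";
    static final String DOC_C =
        "<?xml version='1.0'?><TOP xmlns='http://bigco.com'>"
        + "<ONE>C6217</ONE><TWO>PO1001</TWO></TOP>";
    static final String DOC_D =
        "<?xml version='1.0'?><foo:TOP xmlns:foo='http://bigco.com'>"
        + "<foo:ONE>C6217</foo:ONE><foo:TWO>PO1001</foo:TWO></foo:TOP>";

    // Parses the XML namespace-aware and returns the string value of the
    // first node selected by the xpath, or "" when nothing is selected.
    static String select(String xml, String xpath) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xpath = "/TOP/ONE/text()";
        for (String xml : new String[] {DOC_A, DOC_B, DOC_C, DOC_D}) {
            System.out.println("[" + select(xml, xpath) + "]");
        }
    }
}
```

In this sketch the unprefixed path selects the text for A and B and selects nothing for C and D, because an unprefixed name test in XPath 1.0 only matches elements that are in no namespace.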

For document "type" C there is the namespace attribute definition
xmlns="http://bigco.com" in the TOP element to deal with.
Leaving it as is is a no-go. So what I do is remove the attribute
node, then re-serialize the document and reparse it, so that I have
a document without the namespace attribute. Simply removing the node
does not work. I imagine the original document DOM is in some way
caching the fact that it once had a namespace attribute, but this is
a Xerces problem.
With the reparsed document, the xpath can be applied successfully.
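A likely explanation for why removing the attribute node alone has no effect: in the W3C DOM, an element's namespace URI is fixed when the node is created and is not recomputed from the xmlns declarations that happen to be in scope. The sketch below illustrates this with the JDK's bundled parser; the class and method names are made up for the example, and it does not use the Xerces or Xalan versions from the message.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class XmlnsRemovalDemo {
    static final String DOC_C =
        "<?xml version='1.0'?><TOP xmlns='http://bigco.com'>"
        + "<ONE>C6217</ONE><TWO>PO1001</TWO></TOP>";

    // Removes the xmlns attribute from the root element and returns the
    // namespace URI that the root element still reports afterwards.
    static String namespaceAfterRemovingXmlns(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Element top = doc.getDocumentElement();
            top.removeAttribute("xmlns");      // the declaration is gone...
            return top.getNamespaceURI();      // ...but the element keeps its URI
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(namespaceAfterRemovingXmlns(DOC_C));
    }
}
```

This is why a serialize-and-reparse round trip changes the outcome: the new parse creates the elements afresh, this time with no namespace.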

Lastly, documents of "type" D are the problem.

One solution might be to walk the xpath expression and add the
namespace prefix (in the above example of document D the prefix is
"foo") in all the appropriate places in the xpath. I could not find
any support for this type of surgery in the xalan xpath library and
did not want to step up and write my own lexer simply to add the
prefix on a case-by-case basis.

Another is to apply some namespace prefix, say "bar", to the xpath just
once:

xpath = "/bar:TOP/bar:ONE/text()"

(again, I could not find support for such an operation) and then
use a context document:

<?xml version='1.0' encoding='utf-8'?>
<bar:HOEDEEDO xmlns:bar='http://bigco.com'/>

and the xpath api

        // The third argument is the node used to resolve prefixes in the xpath.
        return XPathAPI.selectSingleNode(doc, xpath, contextDocument);
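The same idea, binding an arbitrary prefix such as "bar" to the namespace independently of whatever prefix the instance document uses, can be sketched with the JDK's javax.xml.xpath API and a NamespaceContext. This is an assumption-laden alternative to the context-document approach, not the Xalan XPathAPI call itself; the class and constant names are illustrative.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class PrefixedXPathDemo {
    static final String BIGCO_NS = "http://bigco.com";
    static final String DOC_C =
        "<TOP xmlns='http://bigco.com'><ONE>C6217</ONE><TWO>PO1001</TWO></TOP>";
    static final String DOC_D =
        "<foo:TOP xmlns:foo='http://bigco.com'>"
        + "<foo:ONE>C6217</foo:ONE><foo:TWO>PO1001</foo:TWO></foo:TOP>";

    // Maps the single prefix "bar" to the bigco namespace.
    static final NamespaceContext BAR_CONTEXT = new NamespaceContext() {
        public String getNamespaceURI(String prefix) {
            return "bar".equals(prefix) ? BIGCO_NS : XMLConstants.NULL_NS_URI;
        }
        public String getPrefix(String uri) {
            return BIGCO_NS.equals(uri) ? "bar" : null;
        }
        public Iterator<String> getPrefixes(String uri) {
            List<String> prefixes = BIGCO_NS.equals(uri)
                ? Collections.singletonList("bar")
                : Collections.<String>emptyList();
            return prefixes.iterator();
        }
    };

    // Evaluates a "bar:"-prefixed xpath against a namespace-aware parse.
    static String select(String xml, String xpath) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xp = XPathFactory.newInstance().newXPath();
            xp.setNamespaceContext(BAR_CONTEXT);
            return xp.evaluate(xpath, doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xpath = "/bar:TOP/bar:ONE/text()";
        System.out.println(select(DOC_C, xpath));
        System.out.println(select(DOC_D, xpath));
    }
}
```

The prefix in the xpath only has to resolve to the same namespace URI as the elements; it does not have to match "foo" or the default-namespace declaration in the document.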

This would work, but how can I automatically walk the original xpath
and apply the namespace prefix in all the right places?
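One rough way to approach the rewriting step, short of a full lexer, is a token scan that prefixes every unprefixed name test. The sketch below is a heuristic written for this illustration (XPathPrefixer and addPrefix are hypothetical names, not part of Xalan). It assumes XPath 1.0, assumes every unprefixed element name belongs to the one namespace, leaves attribute names, function calls, node-type tests, axis names, variables, string literals, and already-prefixed names alone, and does not implement the complete XPath lexical rules.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class XPathPrefixer {
    // Matches either a string literal or a name-like token.
    private static final Pattern TOKEN =
        Pattern.compile("'[^']*'|\"[^\"]*\"|[A-Za-z_][A-Za-z0-9_.-]*");
    private static final List<String> OPERATORS =
        Arrays.asList("and", "or", "div", "mod");

    // Returns the xpath with "prefix:" added to each unprefixed element name.
    static String addPrefix(String xpath, String prefix) {
        StringBuilder out = new StringBuilder();
        Matcher m = TOKEN.matcher(xpath);
        int last = 0;
        while (m.find()) {
            out.append(xpath, last, m.start());
            last = m.end();
            String token = m.group();
            boolean rewrite = isElementName(xpath, m.start(), m.end(), token);
            out.append(rewrite ? prefix + ":" + token : token);
        }
        return out.append(xpath.substring(last)).toString();
    }

    private static boolean isElementName(String xpath, int start, int end, String token) {
        char first = token.charAt(0);
        if (first == '\'' || first == '"') return false;          // string literal
        String before = xpath.substring(0, start).trim();
        String after = xpath.substring(end).trim();
        // Followed by "(": function or node-type test. Followed by ":": axis or prefix.
        if (after.startsWith("(") || after.startsWith(":")) return false;
        if (before.endsWith("@") || before.endsWith("$")) return false;
        if (before.endsWith("attribute::") || before.endsWith("namespace::")) return false;
        // Local part of a name that already carries a prefix.
        if (before.endsWith(":") && !before.endsWith("::")) return false;
        if (OPERATORS.contains(token) && !before.isEmpty()) {
            // A name right after an operand is an operator, not an element name.
            char prev = before.charAt(before.length() - 1);
            if (Character.isLetterOrDigit(prev) || ")]'\".".indexOf(prev) >= 0) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(addPrefix("/TOP/ONE/text()", "bar"));
    }
}
```

For the example path in this message the sketch yields "/bar:TOP/bar:ONE/text()". Expressions that mix several namespaces, or that put whitespace in unusual places, would need a real XPath parser.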

Another solution might be to walk documents of "type" D and remove the
namespace prefixes ... but this is not a Xalan problem, and there may
already be sub-elements in the document from another namespace
that is the default one for the document.
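A minimal sketch of that document-side approach: copy the parsed tree into a new document in which every element and attribute is in no namespace, then apply the unprefixed xpath to the copy. It relies only on the JDK's DOM and javax.xml.xpath APIs, and the names NamespaceStripper, strip, and select are hypothetical. As noted above it is lossy: elements from different namespaces that share a local name become indistinguishable.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NamespaceStripper {
    static final String DOC_D =
        "<foo:TOP xmlns:foo='http://bigco.com'>"
        + "<foo:ONE>C6217</foo:ONE><foo:TWO>PO1001</foo:TWO></foo:TOP>";

    // Parses namespace-aware, strips namespaces, then evaluates the xpath.
    static String select(String xml, String xpath) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document parsed = builder.parse(new InputSource(new StringReader(xml)));
            Document stripped = strip(parsed, builder.newDocument());
            return XPathFactory.newInstance().newXPath().evaluate(xpath, stripped);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Fills the empty document 'out' with a namespace-free copy of 'src'.
    static Document strip(Document src, Document out) {
        out.appendChild(copy(src.getDocumentElement(), out));
        return out;
    }

    private static Node copy(Node node, Document out) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return out.importNode(node, true);   // text, comments, and so on
        }
        Element result = out.createElementNS(null, node.getLocalName());
        NamedNodeMap attrs = node.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            // Drop xmlns declarations; keep other attributes by local name.
            if (!XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(attr.getNamespaceURI())) {
                result.setAttributeNS(null, attr.getLocalName(), attr.getNodeValue());
            }
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            result.appendChild(copy(c, out));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(select(DOC_D, "/TOP/ONE/text()"));
    }
}
```

Because the copy is built in memory, this avoids the serialize-and-reparse round trip, at the cost of walking the whole tree once per document.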

Any help would be appreciated.
Thanks.
Richard


