xml-general mailing list archives

From Jacob Kjome <h...@visi.com>
Subject Re: how do I detect internal subset when part of external subset?
Date Fri, 07 Apr 2006 03:07:57 GMT

Thanks for the tip, Elliotte.  I'll remember it 
when I use SAX.  I'm using XNI in this case.  I 
suppose I could use SAX, but I'm really just 
trying to migrate XMLC from Xerces1 to Xerces2.  
XMLC already depends directly on Xerces because 
of the custom DOMs that XMLC implements, and I 
want to change as little as possible.  I may 
make more radical changes once I've proven that 
I can make things work properly with minimal changes.

In any case, I think I've got the internal subset 
stuff working, except for one thing.  Take the following document...

<?xml version="1.0" standalone="no"?>
<!DOCTYPE document SYSTEM "document.dtd" [
   <!ENTITY head SYSTEM "header.xml">
   <!ENTITY foot SYSTEM "footer.xml">
   <!ENTITY torso SYSTEM "body.xml">
   <!ENTITY erh "Elliotte Rusty Harold">
]>
<document>
   &head; &torso; &foot;
</document>

The only part of this that ends up in the 
internal subset is the "erh" entity.  That is, 
internalEntityDecl() is called only for "erh"; 
I get no notification at all for the other 
entities.  Then, as I build up the DOM, I create 
EntityReference nodes for "&head; &torso; &foot;" 
in <document>.  They survive serialization, but 
because I was never notified to create the 
corresponding <!ENTITY> declarations in the 
internal subset, re-parsing the serialized 
document fails.  So, how do I get notified about 
these declarations so I can carry them into the 
DOM unexpanded?  I want the serialized DOM to 
look as close as possible to the document above.  
I must be missing something.
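
To make the situation concrete, here is a minimal sketch in plain 
SAX (not XNI, and not XMLC code) that logs every entity declaration 
for the document above and tags it with the subset it came from, 
using the "[dtd]" entity name from the quoted tip below.  It relies 
on the standard SAX DeclHandler and LexicalHandler callbacks.  The 
class name EntityDeclLogger, the SAMPLE constant, and the resolver 
that hands back empty content for document.dtd, header.xml, 
footer.xml and body.xml are assumptions made only so that the 
example runs on its own.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper: records each entity declaration the parser reports.
public class EntityDeclLogger extends DefaultHandler2 {

    // The sample document from this message.
    public static final String SAMPLE =
          "<?xml version=\"1.0\" standalone=\"no\"?>\n"
        + "<!DOCTYPE document SYSTEM \"document.dtd\" [\n"
        + "   <!ENTITY head SYSTEM \"header.xml\">\n"
        + "   <!ENTITY foot SYSTEM \"footer.xml\">\n"
        + "   <!ENTITY torso SYSTEM \"body.xml\">\n"
        + "   <!ENTITY erh \"Elliotte Rusty Harold\">\n"
        + "]>\n"
        + "<document>\n"
        + "   &head; &torso; &foot;\n"
        + "</document>\n";

    private boolean inExternalSubset = false;
    private final List<String> seen = new ArrayList<>();

    // LexicalHandler: the external DTD subset is reported as entity "[dtd]".
    @Override
    public void startEntity(String name) {
        if ("[dtd]".equals(name)) inExternalSubset = true;
    }

    @Override
    public void endEntity(String name) {
        if ("[dtd]".equals(name)) inExternalSubset = false;
    }

    // DeclHandler: entities declared with a literal replacement value.
    @Override
    public void internalEntityDecl(String name, String value) {
        record("internal entity", name);
    }

    // DeclHandler: parsed entities declared with SYSTEM or PUBLIC identifiers.
    @Override
    public void externalEntityDecl(String name, String publicId, String systemId) {
        record("external entity", name);
    }

    private void record(String kind, String name) {
        String where = inExternalSubset ? "external subset" : "internal subset";
        seen.add(where + ": " + kind + " " + name);
    }

    public static List<String> declarations(String xml) throws Exception {
        EntityDeclLogger handler = new EntityDeclLogger();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setProperty("http://xml.org/sax/properties/declaration-handler", handler);
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        // Assumption: none of the referenced files exist, so serve empty content.
        reader.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.seen;
    }

    public static void main(String[] args) throws Exception {
        declarations(SAMPLE).forEach(System.out::println);
    }
}
```

In SAX, "head", "foot" and "torso" arrive through 
externalEntityDecl() rather than internalEntityDecl(), because 
"internal" and "external" in those method names describe where the 
entity's replacement text lives, not which DTD subset holds the 
declaration.  Whether the same split explains the XNI behaviour 
described above is something I have not verified.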


Jake


At 06:41 AM 4/4/2006, you wrote:
 >The trick is to look for the entity name "[dtd]". XOM accomplishes this
 >thusly using pure SAX:
 >
 >
 >     protected boolean inExternalSubset = false;
 >
 >     // We have a problem here. Xerces gets this right,
 >     // but Crimson and possibly other parsers don't properly
 >     // report these entities, or perhaps just not tag them
 >     // with [dtd] like they're supposed to.
 >     public void startEntity(String name) {
 >       if (name.equals("[dtd]")) inExternalSubset = true;
 >     }
 >
 >
 >     public void endEntity(String name) {
 >       if (name.equals("[dtd]")) inExternalSubset = false;
 >     }
 >
 >You can just reverse the logic if you prefer inInternalSubset.
 >
 >--
 >Elliotte Rusty Harold  elharo@metalab.unc.edu
 >XML in a Nutshell 3rd Edition Just Published!
 >http://www.cafeconleche.org/books/xian3/
 >http://www.amazon.com/exec/obidos/ISBN=0596007647/cafeaulaitA/ref=nosim
 >
 >---------------------------------------------------------------------
 >To unsubscribe, e-mail: general-unsubscribe@xml.apache.org
 >For additional commands, e-mail: general-help@xml.apache.org
 >
 >
 > 



