xml-general mailing list archives

From "Andrews, Scott" <Andre...@ctcnsc.org>
Subject RE: How do I parse a DTD in Java?
Date Tue, 13 Jan 2004 15:42:30 GMT
So I've discovered, but that still doesn't give me an answer to my question.

The best path I've found thus far is to convert the DTDs into XML schema
and use Sun's XSOM API to parse the XML Schema into memory.

I'm trying this out now.  JAXB looked good on the surface, but it
requires compiling the XML Schema or DTD ahead of time, and then you
have to write code specific to the Java classes generated by that
compile.

I need something dynamic, where the structure of the XML document is
unknown until parse time, for example when the user supplies the DTD.
At run-time I want to iterate over the structure and let the user
specify actions to be taken when those elements are encountered in the
XML documents.
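
One JDK-only way to get at the declarations at run-time is SAX's
DeclHandler, which receives a callback for each ELEMENT and ATTLIST
declaration the parser reads. The sketch below is only an illustration
under some assumptions: the class name DtdDeclDump and the method
readElementDecls are made up for this example, the DTD is supplied as
internal-subset text (the part between "[" and "]"), and that text is
wrapped in a throwaway document with an empty root element so the parser
has something well-formed to read. The content model strings are
whatever the parser reports.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DeclHandler;

public class DtdDeclDump {

    /**
     * Hypothetical helper: returns element name -> content model for the
     * declarations in the given internal-subset text.
     */
    static Map<String, String> readElementDecls(String internalSubset, String rootName)
            throws Exception {
        final Map<String, String> models = new LinkedHashMap<>();

        // Only element declarations are recorded here; the other callbacks
        // are left empty to keep the sketch short.
        DeclHandler handler = new DeclHandler() {
            @Override public void elementDecl(String name, String model) {
                models.put(name, model);
            }
            @Override public void attributeDecl(String eName, String aName,
                    String type, String mode, String value) { }
            @Override public void internalEntityDecl(String name, String value) { }
            @Override public void externalEntityDecl(String name,
                    String publicId, String systemId) { }
        };

        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setProperty("http://xml.org/sax/properties/declaration-handler", handler);

        // Assumption: wrap the declarations in a DOCTYPE plus an empty root
        // element. The parser is non-validating, so the empty root is accepted.
        String doc = "<!DOCTYPE " + rootName + " [" + internalSubset + "]>"
                   + "<" + rootName + "/>";
        reader.parse(new InputSource(new StringReader(doc)));
        return models;
    }

    public static void main(String[] args) throws Exception {
        String dtd = "<!ELEMENT TVSCHEDULE (CHANNEL+)>"
                   + "<!ELEMENT CHANNEL (BANNER, DAY+)>"
                   + "<!ELEMENT BANNER (#PCDATA)>"
                   + "<!ELEMENT DAY (#PCDATA)>";
        for (Map.Entry<String, String> e : readElementDecls(dtd, "TVSCHEDULE").entrySet()) {
            System.out.println(e.getKey() + " : " + e.getValue());
        }
    }
}
```

The map could then drive whatever per-element actions the user picks.
Filling in attributeDecl in the same way would collect the ATTLIST
declarations as well.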
  _____  

Scott Andrews
Principal Software Engineer
Concurrent Technologies Corporation
(814) 269 6580 (Monday, Wednesday, Friday)
(814) 632 9559 (Tuesday, Thursday)
(814) 880 8522 (Cell)
 

-----Original Message-----
From: Anne Thomas Manes [mailto:anne@manes.net] 
Sent: Tuesday, January 13, 2004 10:35 AM
To: general@xml.apache.org
Subject: Re: How do I parse a DTD in Java?

That's because a DTD is NOT XML.
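
A small check makes the point concrete. This is only a sketch: the class
name DtdIsNotXml and the helper isWellFormedXml are invented for
illustration. It feeds three strings to DocumentBuilder: bare DTD
declarations, a plain XML element, and the same declarations placed
inside a DOCTYPE in front of a root element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class DtdIsNotXml {

    /** Hypothetical helper: true if the text parses as an XML document. */
    static boolean isWellFormedXml(String text) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            db.parse(new InputSource(new StringReader(text)));
            return true;
        } catch (SAXParseException e) {
            // The parser rejected the input as not well-formed.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Bare declarations with no root element: not an XML document.
        System.out.println(isWellFormedXml("<!ELEMENT A (#PCDATA)>"));
        // A root element on its own.
        System.out.println(isWellFormedXml("<A/>"));
        // Declarations inside a DOCTYPE, followed by a root element.
        System.out.println(isWellFormedXml("<!DOCTYPE A [<!ELEMENT A (#PCDATA)>]><A/>"));
    }
}
```

DTD declarations are only accepted as part of a document's DOCTYPE; on
their own they are not a document the DOM parser can load.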

At 09:54 AM 1/13/2004, you wrote:
>How do I parse a DTD into an in-memory Java object, like a TreeMap or 
>perhaps some XML specific collection class?
>
>I asked this question the other day, and got an answer that the 
>DocumentBuilder parse method should handle the parsing of a DTD - since a 
>DTD IS XML.
>
>However, I get basic parsing errors when inputting a simple DTD.  The code 
>works fine on XML documents, but not on DTDs.  The code I'm using to parse 
>the DTD looks like this:
>
>      // Requires java.io.File, javax.xml.parsers.DocumentBuilder,
>      // javax.xml.parsers.DocumentBuilderFactory, and org.w3c.dom.*.
>      public static void main( String argArgs[] ) throws Exception {
>             File dtdFile = new File( "C:\\APIS\\WorkSpace\\tv.dtd" );
>             DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
>             DocumentBuilder db = dbf.newDocumentBuilder();
>             // Parse the file as an XML document.
>             Document document = db.parse( dtdFile );
>             parseChildrenRecursivly( document.getChildNodes() );
>      }
>
>      // Prints every non-text node, depth first.
>      public static void parseChildrenRecursivly( NodeList argNodeList ) {
>
>             if (argNodeList == null) {
>                 return;
>             }
>
>             Node node;
>             for (int i=0; i<argNodeList.getLength(); i++) {
>                  node = argNodeList.item( i );
>                  if (node.getNodeType() != Node.TEXT_NODE) {
>                      System.out.println(
>                         "node.nodeName = " + node.getNodeName() + "; " +
>                         "node.nodeType = " + Short.toString( node.getNodeType() ) + "; " +
>                         "node.localName = " + node.getLocalName() + "; " +
>                         "node.namespaceUri = " + node.getNamespaceURI() + "; " +
>                         "node.nodeValue = " + node.getNodeValue() + "; "
>                      );
>                      parseChildrenRecursivly( node.getChildNodes() );
>                  }
>             } // for
>      }
>
>However, I get errors when making the attempt:
>
>[Fatal Error] :-1:-1: Premature end of file.
>ERR:> Exception Premature end of file.
>
>The DTD I'm trying to parse is just an example.  It looks like this, where 
>the elements are embedded inside the DOCTYPE tag:
>
><!DOCTYPE TVSCHEDULE [
>
><!ELEMENT TVSCHEDULE (CHANNEL+)>
><!ELEMENT CHANNEL (BANNER, DAY+)>
><!ELEMENT BANNER (#PCDATA)>
><!ELEMENT DAY ((DATE, HOLIDAY) | (DATE, PROGRAMSLOT+))+>
><!ELEMENT HOLIDAY (#PCDATA)>
><!ELEMENT DATE (#PCDATA)>
><!ELEMENT PROGRAMSLOT (TIME, TITLE, DESCRIPTION?)>
><!ELEMENT TIME (#PCDATA)>
><!ELEMENT TITLE (#PCDATA)>
><!ELEMENT DESCRIPTION (#PCDATA)>
>
><!ATTLIST TVSCHEDULE NAME CDATA #REQUIRED>
><!ATTLIST CHANNEL CHAN CDATA #REQUIRED>
><!ATTLIST PROGRAMSLOT VTR CDATA #IMPLIED>
><!ATTLIST TITLE RATING CDATA #IMPLIED>
><!ATTLIST TITLE LANGUAGE CDATA #IMPLIED>
>
>]>
>
>If I just parse the ELEMENTS, by removing the DOCTYPE tag, I still get errors:
>
>Exception The markup in the document preceding the root element must be
>well-formed.
>[Fatal Error] tv.dtd:3:3: The markup in the document preceding the root
>element must be well-formed.
>
>Anybody have a clue how to parse a DTD, so I can get an in-memory 
>structure of the DTD in Java?
>
>
>
>----------
>Scott Andrews
>Principal Software Engineer
>Concurrent Technologies Corporation
>(814) 269 6580 (Monday, Wednesday, Friday)
>(814) 632 9559 (Tuesday, Thursday)
>(814) 880 8522 (Cell)
>
>



---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@xml.apache.org
For additional commands, e-mail: general-help@xml.apache.org

