From Murray Altheim <m.alth...@open.ac.uk>
Subject Re: Content is not allowed in prolog
Date Thu, 12 Feb 2004 10:34:14 GMT
Svgdeveloper@aol.com wrote:
> In a message dated 2/12/2004 10:18:30 AM GMT Standard Time, 
> vladimir.cvetkovic@ericsson.com writes:
>>Hi All,
>>I'm getting the following error message when I try to import a well-formed 
>>and valid xml document in xindice1.1b3 db:
>>[DEBUG] DatabaseImpl - -Using SAX Driver: 'xerces'
>>[DEBUG] DatabaseImpl - -Using Service Location: '/xindice/'
>>[DEBUG] CollectionImpl - -Using URL: 'http://localhost:8080/xindice/'
>>[Fatal Error] :2:1: Content is not allowed in prolog.
>>ERROR : Content is not allowed in prolog.
>>Xindice Command Tools v1.1b3
>>What does it mean?
> Vladimir,
> It would be useful to see the prolog of the document.
> Andrew Watt

In SGML and XML, a document is composed of two sequential parts,
the prolog and the instance. You can see this in an XHTML example:

1    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
2             "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
3    <html xmlns="http://www.w3.org/1999/xhtml">
4    <head>
5    <title>The Symbol Grounding Problem</title>
6    </head>
7    <body>
8    </body>
9    </html>

In this example, the prolog is lines 1-2 and the instance begins on
line 3. The prolog holds the DOCTYPE declaration (and the XML
declaration, when there is one). The DOCTYPE declaration points to
the external subset and may also contain an internal subset (which
you seldom see, but it's legal); the two subsets together make up
the DTD. The document instance consists of the document element
(in this case <html>) and all of its descendant content.
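
To make the prolog/instance boundary concrete, here is a minimal
sketch in Python using the standard library's expat bindings. The
function name split_prolog is my own invention, and the sketch
assumes the document is UTF-8 with no conflicting encoding
declaration. It cuts the text at the first start-tag, so anything
that follows the document element stays with the instance.

```python
import xml.parsers.expat


def split_prolog(document: str):
    """Split an XML document into (prolog, instance) at the document element."""
    data = document.encode("utf-8")
    offsets = []  # byte offset of the first start-tag (the document element)

    parser = xml.parsers.expat.ParserCreate()

    def on_start_element(name, attrs):
        # Only the first start-tag marks the prolog/instance boundary.
        if not offsets:
            offsets.append(parser.CurrentByteIndex)

    parser.StartElementHandler = on_start_element
    parser.Parse(data, True)  # raises ExpatError if the input is not well-formed

    cut = offsets[0]
    return data[:cut].decode("utf-8"), data[cut:].decode("utf-8")
```

Given the nine-line example above (without the line numbers), the
first part should hold the DOCTYPE declaration and the second should
start at "<html".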

You generally don't want to see the prolog, and you generally don't
want to store it. The DOCTYPE declaration provides a reference to
the DTD, which is read and applied as part of the process of
validating the document. You may want to store the reference(s), but
you wouldn't want to store the DTD each time you store the document,
as that would be a real waste (the DTD is often bigger than the
document).
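
If you do want to keep just the reference(s), something like the
following would pull them out of the DOCTYPE declaration without
fetching the DTD. This is only a sketch using Python's expat
bindings; the function name doctype_reference and the dictionary
keys are my own choices, not anything Xindice provides.

```python
import xml.parsers.expat


def doctype_reference(document: str):
    """Return the identifiers from the DOCTYPE declaration, or {} if there is none."""
    found = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called once when the parser reaches the DOCTYPE declaration.
        found["name"] = name
        found["public_id"] = public_id
        found["system_id"] = system_id
        found["has_internal_subset"] = bool(has_internal_subset)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(document.encode("utf-8"), True)
    return found
```

The public and system identifiers are a few dozen bytes, which is
the part worth storing alongside the document.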

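It sounds like your well-formed and valid document isn't being
considered as such by the XML processor. The error message indicates
that there is content (i.e., either elements or character data) in
the part of the document the parser still considers the prolog. The
":2:1" in the message is the line and column where the parser gave
up, so look at the very start of line 2 of your file. You may be
missing the closing ">" of the DOCTYPE declaration (the end of
line 2 above). After the system identifier the parser expects either
that ">" or the "[" that opens the internal subset, so if it found
"<html" (or something similar) there instead, you might get that
error.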
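
One way to check this outside Xindice is to run the file through
another parser and see where it stops. The sketch below uses
Python's expat bindings; the function name prolog_error is made up
for illustration. Note that expat words its errors differently from
Xerces, so don't expect the literal "Content is not allowed in
prolog" text, only a line and column to look at.

```python
import xml.parsers.expat
from typing import Optional


def prolog_error(document: str) -> Optional[str]:
    """Return the parser's error message, or None if the document is well-formed."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(document.encode("utf-8"), True)
    except xml.parsers.expat.ExpatError as err:
        # str(err) includes the line and column where parsing stopped.
        return str(err)
    return None


# Hypothetical test inputs: the second drops the closing ">" of the DOCTYPE.
good = '<!DOCTYPE html PUBLIC "-//X//DTD Y//EN" "y.dtd">\n<html></html>'
bad = '<!DOCTYPE html PUBLIC "-//X//DTD Y//EN" "y.dtd"\n<html></html>'

print(prolog_error(good))
print(prolog_error(bad))
```

Stray characters before the document element (for example, text
pasted in ahead of the first tag) should be rejected the same way.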


Murray Altheim                    http://kmi.open.ac.uk/people/murray/
Knowledge Media Institute
The Open University, Milton Keynes, Bucks, MK7 6AA, UK               .
