xerces-j-users mailing list archives

From Martin Vysny <vyzi...@host.sk>
Subject Re: going crazy with this: org.xml.sax.SAXParseException: Content is not allowed in prolog
Date Sun, 31 Jul 2005 17:19:14 GMT
Robert Houben wrote:
> This may not be your problem, but I've wasted tons of time in the past
> because of these symptoms, so here is why it happened to me...
> 
> I have seen this happen when a file is read that contains byte order
> marks at the beginning.  Most editors strip these out and get the
> encoding right, so you don't know this is happening.  If you are doing
> your own file reader to get an InputStream, you may need to skip a few
> bytes at the beginning, setting the encoding value correctly based on
> them, prior to setting up the reader. To tell if this is happening to
> you, on a windows system, use the debug.exe command from the command
> line:
> 
> C:\>debug test.xml
> -d
> 1480:0100  FF FE 3C 00 74 00 65 00-73 00 74 00 3E 00 74 00   ..<.t.e.s.t.>.t.
> 1480:0110  65 00 73 00 74 00 3C 00-2F 00 74 00 65 00 73 00   e.s.t.<./.t.e.s.
> 1480:0120  74 00 3E 00 0D 00 0A 00-00 00 00 00 00 00 00 00   t.>.............
> 1480:0130  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
> 1480:0140  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
> 1480:0150  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
> 1480:0160  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
> 1480:0170  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
> -q
> 
> C:\>
> 
> Note that the file starts with "FFFE", which is the UTF-16 Little
> Endian byte order mark (BOM).  If you create your own file reader and
> try to pull this in, you will encounter the error that you are
> mentioning.  Notepad will show this as normal text, you'll never see the
> funny stuff.
> 
> HTH,

I had the same problem as well. When you save a file in notepad.exe 
as UTF-8, it places an invisible 3-byte UTF-8 byte order mark (EF BB BF) 
at the start of the XML file. That is what causes that goddamn 
"Content is not allowed in prolog" message.
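
One way to work around this is to strip the mark yourself before the
bytes reach the parser. The following is only a minimal sketch, assuming
the file is UTF-8 and you control the InputStream; the class name
BomSkipper and the method skipUtf8Bom are made up for illustration and
are not part of Xerces. Whether a given parser version copes with the BOM
on its own varies, so this removes it regardless.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper, not part of Xerces or the JDK.
public class BomSkipper {

    /**
     * Returns a stream positioned just past a leading UTF-8 BOM
     * (bytes EF BB BF). If no BOM is present, the bytes that were
     * peeked at are pushed back so nothing is lost.
     */
    public static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];

        // Read up to 3 bytes; a single read() may return fewer.
        int n = 0;
        while (n < 3) {
            int r = pushback.read(head, n, 3 - n);
            if (r < 0) {
                break; // end of stream before 3 bytes
            }
            n += r;
        }

        boolean hasBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;

        if (!hasBom && n > 0) {
            pushback.unread(head, 0, n); // not a BOM, restore the bytes
        }
        return pushback;
    }

    public static void main(String[] args) throws Exception {
        // Simulates a file saved by Notepad as UTF-8: BOM, then the XML.
        byte[] data = {
            (byte) 0xEF, (byte) 0xBB, (byte) 0xBF,
            '<', 't', 'e', 's', 't', '>', 't', 'e', 's', 't',
            '<', '/', 't', 'e', 's', 't', '>'
        };

        InputStream clean = skipUtf8Bom(new ByteArrayInputStream(data));
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(clean);
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

For the UTF-16 case from Robert's dump (FF FE) the same idea applies, but
the two bytes also tell you which encoding to use for the reader, so a
real implementation would have to check for those marks as well.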
