xerces-j-users mailing list archives

From Bob Foster <...@objfac.com>
Subject Re: going crazy with this: org.xml.sax.SAXParseException: Content is not allowed in prolog
Date Sun, 31 Jul 2005 20:19:57 GMT
So this is a bug report? The parser is not skipping past the BOM.

Odd, I haven't seen that.

Bob Foster
http://xmlbuddy.com/

Martin Vysny wrote:
> Robert Houben wrote:
> 
>> This may not be your problem, but I've wasted tons of time in the past
>> because of these symptoms, so here is why it happened to me...
>>
>> I have seen this happen when a file is read that contains byte order
>> marks at the beginning.  Most editors strip these out and get the
>> encoding right, so you don't know this is happening.  If you are doing
>> your own file reader to get an InputStream, you may need to skip a few
>> bytes at the beginning, setting the encoding value correctly based on
>> them, prior to setting up the reader. To tell if this is happening to
>> you, on a Windows system, use the debug.exe command from the command
>> line:
>>
>> C:\>debug test.xml
>> -d
>> 1480:0100  FF FE 3C 00 74 00 65 00-73 00 74 00 3E 00 74 00   ..<.t.e.s.t.>.t.
>> 1480:0110  65 00 73 00 74 00 3C 00-2F 00 74 00 65 00 73 00   e.s.t.<./.t.e.s.
>> 1480:0120  74 00 3E 00 0D 00 0A 00-00 00 00 00 00 00 00 00   t.>.............
>> 1480:0130  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
>> 1480:0140  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
>> 1480:0150  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
>> 1480:0160  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
>> 1480:0170  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00   ................
>> -q
>>
>> C:\>
>>
>> Note that the file starts with "FF FE", which is the UTF-16 little-endian
>> byte order mark (BOM).  If you create your own file reader and
>> try to pull this in, you will encounter the error that you are
>> mentioning.  Notepad will show this as normal text; you'll never see the
>> funny stuff.
>>
>> HTH,
> 
> 
> I had the same problem as well. When you save a file in notepad.exe 
> as UTF-8, it places an invisible 3-byte UTF-8 byte order mark (EF BB BF) 
> at the start of the XML file. That is what causes that goddamn 
> "Content is not allowed in prolog" message.
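
For anyone rolling their own reader, here is a minimal sketch of the
"skip a few bytes and set the encoding from them" approach described in
the quoted text. BomSkipper is a hypothetical helper name, not part of
Xerces or the JDK, and it only recognises the UTF-8 and UTF-16 marks
mentioned in this thread; treat it as an illustration under those
assumptions, not a complete encoding detector.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

// Hypothetical helper: strips a leading UTF-8 or UTF-16 BOM from a stream.
final class BomSkipper {
    /** Stream positioned just after the BOM, or at the start if there was none. */
    final InputStream stream;
    /** Encoding implied by the BOM, or null if no BOM was found. */
    final String encoding;

    private BomSkipper(InputStream stream, String encoding) {
        this.stream = stream;
        this.encoding = encoding;
    }

    static BomSkipper open(InputStream in) throws IOException {
        PushbackInputStream pb = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        // read() may return fewer bytes than requested, so loop until 3 bytes or EOF.
        while (n < 3) {
            int r = pb.read(head, n, 3 - n);
            if (r < 0) break;
            n += r;
        }

        String enc = null;
        int bomLen = 0;
        if (n >= 3 && (head[0] & 0xFF) == 0xEF
                   && (head[1] & 0xFF) == 0xBB
                   && (head[2] & 0xFF) == 0xBF) {
            enc = "UTF-8";      // EF BB BF
            bomLen = 3;
        } else if (n >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            enc = "UTF-16LE";   // FF FE (UTF-32LE also starts this way; not handled here)
            bomLen = 2;
        } else if (n >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            enc = "UTF-16BE";   // FE FF
            bomLen = 2;
        }

        // Push back every byte that was read but is not part of the BOM.
        if (n > bomLen) {
            pb.unread(head, bomLen, n - bomLen);
        }
        return new BomSkipper(pb, enc);
    }
}
```

The returned stream can then be wrapped in an InputStreamReader using
the detected encoding (when it is not null) before it is handed to the
parser, so the parser never sees the BOM bytes.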



