xerces-c-users mailing list archives

From Alberto Massari <amass...@datadirect.com>
Subject Re: SAXParseException, wrong line and column numbers
Date Mon, 21 Sep 2009 07:28:40 GMT
This is a known problem in Xerces 
(http://issues.apache.org/jira/browse/XERCESC-1288); the invalid byte is 
detected by the transcoder while it reads a new chunk of data, before 
the valid data preceding it in that chunk has been processed. The last 
known position is therefore far from the actual error location.
A possible fix would be for the transcoder to return the data that is 
valid, and to report the error only when the bad data is at the 
beginning of the chunk.

Alberto

Boris Kolpackov wrote:
> Hi Igor,
>
> Igor Ignatyuk <igor_ignatiouk@hotmail.com> writes:
>
>   
>> I am parsing the following file, which is encoded as Windows-1252:
>>
>> <?xml version="1.0" ?>
>> <test>ä</test>
>>
>> The implicit XML encoding is UTF-8, therefore it is correct that I get a
>> parsing error, but SAXParseException::getLineNumber and
>> SAXParseException::getColumnNumber return wrong values:
>>
>> line 1, column 23: invalid byte '<' at position 2 of a 3-byte sequence
>>
>> IMHO the line number should be 2, the column number - 8 (position of the
>> character '<') or 7 (position of the character 'ä').
>>     
>
> Can you submit a bug report for this problem and attach the sample
> XML to it (I cannot reproduce the problem by copying and pasting
> the XML fragment from the email):
>
> http://xerces.apache.org/xerces-c/bug-report.html
>
>
> Thanks,
> 	Boris
>
>   

