abdera-user mailing list archives

From James M Snell <jasn...@gmail.com>
Subject Re: Invalid byte 2 of 3-byte UTF-8 sequence.
Date Tue, 04 Sep 2007 14:10:10 GMT
The Woodstox parser is very strict when it comes to UTF-8 handling. If
the input contains any invalid byte sequence at all, it simply fails.
Unfortunately, mixed encodings within a single file are more common than
you'd think, so it would be a good idea for us to provide a java.io.Reader
implementation that can scan for and correct (within reason) erroneous
encodings on the fly.
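
To illustrate the kind of Reader described above, here is a minimal sketch. It
does not detect or repair mixed encodings; it swaps in a simpler technique and
replaces each malformed UTF-8 sequence with U+FFFD, using the standard
java.nio.charset decoder, so that a strict parser downstream sees only valid
characters. The class name LenientUtf8 and its methods are hypothetical and
not part of Abdera or Woodstox.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of Abdera or Woodstox. */
public final class LenientUtf8 {
    private LenientUtf8() {}

    /** Wraps a byte stream in a Reader that replaces invalid UTF-8 with U+FFFD. */
    public static Reader open(InputStream in) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                // Substitute the replacement character instead of throwing.
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        return new InputStreamReader(in, decoder);
    }

    /** Reads the whole stream through the lenient Reader into a String. */
    public static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader r = open(in)) {
            char[] buf = new char[4096];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // 0xE2 opens a 3-byte sequence, but 0x28 is not a continuation byte,
        // which is the "invalid byte 2 of 3-byte UTF-8 sequence" case.
        byte[] bad = { 'a', (byte) 0xE2, 0x28, (byte) 0xA1, 'b' };
        System.out.println(readAll(new ByteArrayInputStream(bad)));
    }
}
```

The Reader returned by open() could then be handed to any parser entry point
that accepts a java.io.Reader. Note that this only helps when reading a
document; it does not change what a client writes to a server.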

- James

Iops@gmx.de wrote:
> Hi Chris!
> 
> Thanks for your feedback!
> 
>> This is exactly the bug I am seeing.
>> AFAICT, it is not related to a missing <?xml version="1.0"  
>> encoding="UTF-8"?>,
>> Incidentally, my code worked fine before a recent "svn up" and it has  
>> no <?xml version="1.0" encoding="UTF-8"?>,
> 
> If I understand your problem correctly, it occurs when you parse an entry with an AbderaClient
> (i.e. when calling "entry.getContent()"), right?
> 
> Mine occurs when I use an AbderaClient to create an entry on an external server, which
> is, by the way, a proprietary closed-source product. The server then returns the error
> message while it tries to parse my request.
>  
>> It seems that knowing that another person is seeing the issue  
>> confirms that the issue is on Abdera's side...
> 
> I'm not sure that we are both encountering the same problem. Mine also occurs with
> AbderaClient 0.22. Yours occurred only after updating to 0.30-snapshot, right?
> 
> I don't have the slightest idea whether the problem lies in my code, in the Abdera code,
> or even in the server code.
>  
> My next test will be to create an Atom entry by hand, without the AbderaClient,
> and include an <?xml version="1.0" encoding="UTF-8"?> declaration to check how
> the server reacts.
> 
> Regards,
> 
> Herbert
> 
> 
