abdera-user mailing list archives

From Dan Diephouse <dan.diepho...@mulesource.com>
Subject Re: Invalid byte 2 of 3-byte UTF-8 sequence.
Date Tue, 04 Sep 2007 16:16:29 GMT
I have no idea what's causing this error, but I highly doubt it's 
Woodstox. Woodstox is the most conformant XML parser out there 
(but I could be wrong).

I would strongly suggest avoiding 2.0.5, though, for a number of reasons:
- 3.x has many StAX conformance improvements. AXIOM hasn't really been 
tested with 2.x, and it expects the StAX API to behave a certain way.
- 3.x is faster.
- 3.x has improved XML conformance.

I stepped through the test case a little and wasn't able to see what was 
going on right away. I would need to get the AXIOM sources to really dig 
in more. After a little digging I suspect the bug might lie in there, but 
only because that's the place I haven't looked yet.

Any chance you could catch the message being sent from the server with 
something like TCPMon and post it to the JIRA issue?
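
Once the bytes are captured, a strict UTF-8 decode would show whether the 
payload itself is malformed and at which offset. Here is a minimal sketch, 
assuming the captured body has been loaded into a byte array. The names 
Utf8Check and firstInvalidOffset are hypothetical, not part of Abdera, 
AXIOM, or Woodstox:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: strictly decodes bytes as UTF-8 using only the JDK.
public class Utf8Check {

    /**
     * Returns the byte offset of the first malformed UTF-8 sequence,
     * or -1 if the whole array decodes cleanly.
     */
    public static int firstInvalidOffset(byte[] data) {
        // REPORT makes the decoder stop at bad input instead of
        // silently substituting a replacement character.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(data);
        CharBuffer out = CharBuffer.allocate(1024);
        while (true) {
            // endOfInput=true so a truncated trailing sequence is also reported.
            CoderResult result = decoder.decode(in, out, true);
            if (result.isError()) {
                // On error the input position is at the start of the bad sequence.
                return in.position();
            }
            if (result.isUnderflow()) {
                return -1; // all input consumed without error
            }
            out.clear(); // overflow: discard decoded chars and keep going
        }
    }
}
```

If this returns -1 for the captured body, that would suggest the bytes on 
the wire are fine and the problem is in how the stream is being read.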

- Dan

Chris Berry wrote:
> That fixes it!!!
>
> I modified all of the pertinent POMs accordingly;
> I.e.
> <!--
>       <dependency>
>         <groupId>org.codehaus.woodstox</groupId>
>         <artifactId>wstx-asl</artifactId>
>         <version>3.2.1</version>
>         <scope>runtime</scope>   
>       </dependency>
> -->
>       <dependency>
>         <groupId>woodstox</groupId>
>         <artifactId>wstx-asl</artifactId>
>         <version>2.0.5</version>
>         <scope>runtime</scope>   
>       </dependency>
>
> 9 POMs were affected::
>
> dogstar:~/java/abdera/svn-head-using-old-woostox/trunk cberry$ find . -name "*.xml" | xargs grep woodstox
> ./extensions/gdata/pom.xml:      <groupId>org.codehaus.woodstox</groupId>
> ./extensions/geo/pom.xml:      <groupId>org.codehaus.woodstox</groupId>
> ./extensions/json/pom.xml:      <groupId>org.codehaus.woodstox</groupId>
> ./extensions/main/pom.xml:      <groupId>org.codehaus.woodstox</groupId>
> ./extensions/media/pom.xml:      <groupId>org.codehaus.woodstox</groupId>
> ./extensions/opensearch/pom.xml:      <groupId>org.codehaus.woodstox</groupId>
> ./extensions/sharing/pom.xml:      <groupId>org.codehaus.woodstox</groupId>
> ./parser/pom.xml:      <groupId>org.codehaus.woodstox</groupId>
> ./pom.xml:        <groupId>org.codehaus.woodstox</groupId>
>
> I will add this info to the JIRA.
>
> James,
> Can we move the SVN Head back to 2.0.5 until this is resolved??
>
> FYI: we are using woodstox 3.2.1 in another project with these exact 
> same XMLs without problem??
>
> Thanks much,
> -- Chris
> On Sep 4, 2007, at 10:04 AM, Chris Berry wrote:
>
>> I will try that. I didn't before, because I wasn't sure that it 
>> wasn't required somehow internally...
>>
>> BTW: I ran these XML documents with the supposedly invalid chars through 2 
>> different UTF-8 conversions as I read them from disk, before putting 
>> them into the <content>.
>> I also processed them with the Unix "iconv" utility.
>> So I am pretty darn sure that there are no invalid chars in there.
>>
>> Cheers,
>> -- Chris
>> On Sep 4, 2007, at 9:26 AM, James M Snell wrote:
>>
>>> Well, FWIW, there are no changes in Abdera 0.3.0 that *require* the new
>>> version of woodstox.  If dropping down to an older version addresses
>>> the issue, then we can explore that as a solution.
>>>
>>> - James
>>>
>>> Chris Berry wrote:
>>>> Hmmm.
>>>> FYI:  I saw a similar problem with an earlier 0.3. I was mixing the
>>>> latest woodstox with Abdera
>>>> Or more correctly, maven was bringing in some chained dependencies --
>>>> one of which brought in woodstox 3.2.1.
>>>> Abdera was using woodstox 2.0.5 at that time.
>>>> The problem went away when I corrected this problem....
>>>>
>>>> Note, if this is your problem, you can work around it with the maven
>>>> <exclusions> element
>>>> e.g.
>>>>         <dependency>
>>>>           <groupId>com.whatever</groupId>
>>>>           <artifactId>foo</artifactId>
>>>>           <version>1.2.3</version>
>>>>           <exclusions>
>>>>             <exclusion>
>>>>               <groupId>org.codehaus.woodstox</groupId>
>>>>               <artifactId>wstx-lgpl</artifactId>
>>>>             </exclusion>
>>>>           </exclusions>
>>>>         </dependency>
>>>>
>>>> BTW: this is why I suspect that the Abdera 0.3 UTF-8 issue is
>>>> related to the woodstox upgrade....
>>>>
>>>> Cheers,
>>>> -- Chris
>>>>
>>>> On Sep 4, 2007, at 8:59 AM, Iops@gmx.de wrote:
>>>>
>>>>> Hi Chris!
>>>>>
>>>>> Thanks for your feedback!
>>>>>
>>>>>> This is exactly the bug I am seeing.
>>>>>> AFAICT, it is not related to a missing <?xml version="1.0"
>>>>>> encoding="UTF-8"?>,
>>>>>> Incidentally, my code worked fine before a recent "svn up" and it
>>>>>> has no <?xml version="1.0" encoding="UTF-8"?>,
>>>>>
>>>>> If I understand your problem correctly, it occurs when you parse an
>>>>> entry with an AbderaClient (i.e. calling "entry.getContent()"),
>>>>> right?
>>>>>
>>>>> Mine occurs when I use an AbderaClient to create an entry on an
>>>>> external server, which is, btw, a proprietary closed-source thing. The
>>>>> server then gives me the error message while it tries to parse my
>>>>> request.
>>>>>
>>>>>> It seems that knowing that another person is seeing the issue
>>>>>> confirms that the issue is on Abdera's side...
>>>>>
>>>>> I'm not sure we are both encountering the same problem. My problem
>>>>> also occurs with the AbderaClient 0.22. Yours occurred only after
>>>>> updating to 0.30-snapshot, right?
>>>>>
>>>>> I haven't the slightest idea whether the problem lies in my code, in
>>>>> the Abdera code, or even in the server code.
>>>>>
>>>>> My next test would be to create an Atom entry by hand, without
>>>>> the AbderaClient, and provide an
>>>>> '<?xml version="1.0" encoding="UTF-8"?>' declaration to check how
>>>>> the server reacts.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Herbert
>>>>>
>>>>>
>>>>
>>>> S'all good  ---   chriswberry at gmail dot com


-- 
Dan Diephouse
MuleSource
http://mulesource.com | http://netzooid.com/blog

