From: Chris Berry
To: Chris Berry
Cc: abdera-user@incubator.apache.org
Subject: Re: Invalid byte 2 of 3-byte UTF-8 sequence.
Date: Tue, 4 Sep 2007 17:12:35 -0500

Doh. Sorry, you said that ...

On Sep 4, 2007, at 5:11 PM, Chris Berry wrote:

> What JDK are you using??
>
> On Sep 4, 2007, at 5:08 PM, Stephen Duncan wrote:
>
>> I don't see the problem (i.e. I changed the one assertion you mentioned in
>> the comments & commented out the abdera-extensions dependency that doesn't
>> exist anymore, and the test passed).
>> I'm using:
>>
>> java version "1.6.0"
>> Java(TM) SE Runtime Environment (build 1.6.0-b105)
>> Java HotSpot(TM) Server VM (build 1.6.0-b105, mixed mode)
>>
>> On Kubuntu.
>>
>> -Stephen
>>
>> On 9/4/07, Chris Berry wrote:
>>>
>>> hmmmm.
>>> The Sun vs IBM JDK is worth a try...
>>>
>>> On Sep 4, 2007, at 2:56 PM, James M Snell wrote:
>>>
>>>> Heh.. figures, one platform I can't test. I can confirm that I am not
>>>> seeing this error at all on Windows or Ubuntu using the IBM JDK and
>>>> Woodstox or the WAS stax parser. I haven't tried the Sun JDK yet.
>>>>
>>>> - James
>>>>
>>>> Chris Berry wrote:
>>>>> Macbook Pro -- MAC OS-X 10.3
>>>>>
>>>>> dogstar:~/homeaway/pstore/working-NewAbdera-test cberry$ uname -a
>>>>> Darwin dogstar.local 8.10.1 Darwin Kernel Version 8.10.1: Wed May 23 16:33:00 PDT 2007; root:xnu-792.22.5~1/RELEASE_I386 i386 i386
>>>>>
>>>>> dogstar:~/homeaway/pstore/working-NewAbdera-test cberry$ java -version
>>>>> java version "1.5.0_07"
>>>>> Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_07-164)
>>>>> Java HotSpot(TM) Client VM (build 1.5.0_07-87, mixed mode, sharing)
>>>>>
>>>>> Thanks,
>>>>> -- Chris
>>>>>
>>>>> On Sep 4, 2007, at 2:19 PM, James M Snell wrote:
>>>>>
>>>>>> Hmmm... well, I ran your test cases and have not been able to recreate
>>>>>> the issue at all. I'm running on Ubuntu with the IBM JDK 1.5, tried the
>>>>>> latest woodstox and the stax parser that ships with Websphere, and was
>>>>>> completely unable to get the test to throw any kind of UTF-8 related
>>>>>> errors.
>>>>>>
>>>>>> What operating system are you testing on? What JDK?
>>>>>>
>>>>>> - James
>>>>>>
>>>>>> Chris Berry wrote:
>>>>>>> I added the following JUnit (to the JIRA), which I think proves that
>>>>>>> woodstox 3.2.1 is not the issue.
>>>>>>> It passes fine (no Exceptions thrown).
>>>>>>> So (AFAICT) the issue is somewhere else (Abdera or Axiom??)
>>>>>>> Cheers,
>>>>>>> -- Chris
>>>>>>> ===================================
>>>>>>> package com.homeaway.hcdata.store.provider.blogs;
>>>>>>>
>>>>>>> import junit.framework.Test;
>>>>>>> import junit.framework.TestCase;
>>>>>>> import junit.framework.TestSuite;
>>>>>>>
>>>>>>> import javax.xml.stream.XMLStreamReader;
>>>>>>> import javax.xml.stream.XMLInputFactory;
>>>>>>>
>>>>>>> import java.io.FileInputStream;
>>>>>>>
>>>>>>> import com.ctc.wstx.stax.WstxInputFactory;
>>>>>>>
>>>>>>> public class WoodstoxTest extends TestCase {
>>>>>>>
>>>>>>>     private static final String userdir = System.getProperty( "user.dir" );
>>>>>>>
>>>>>>>     public static Test suite()
>>>>>>>     { return new TestSuite( WoodstoxTest.class ); }
>>>>>>>
>>>>>>>     public void tearDown() throws Exception
>>>>>>>     { super.tearDown(); }
>>>>>>>
>>>>>>>     public void setUp() throws Exception
>>>>>>>     { super.setUp(); }
>>>>>>>
>>>>>>>     public void testWoodstox() throws Exception {
>>>>>>>
>>>>>>>         String filename = userdir + "/var/blogs/cberry/99/9999/en/blog_9999.xml" ;
>>>>>>>
>>>>>>>         // we will simply walk the doc and see if it throws an Exception
>>>>>>>         XMLInputFactory xif = new WstxInputFactory();
>>>>>>>         XMLStreamReader r = xif.createXMLStreamReader(new FileInputStream( filename ));
>>>>>>>         while (r.hasNext()) r.next();
>>>>>>>     }
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> On Sep 4, 2007, at 12:18 PM, Chris Berry wrote:
>>>>>>>
>>>>>>>> On Sep 4, 2007, at 11:16 AM, Dan Diephouse wrote:
>>>>>>>> I have no idea what's causing this error, but I'm highly doubting it's
>>>>>>>> woodstox. Woodstox is the most highly conformant xml parser out there.
>>>>>>>> (but I could be wrong)
>>>>>>>>
>>>>>>>> I would strongly suggest avoiding using 2.0.5 though for a number of
>>>>>>>> reasons
>>>>>>>> - 3.x has many stax conformance improvements.
>>>>>>>>   AXIOM hasn't really been tested with 2.x and it expects the stax api
>>>>>>>>   to react a certain way
>>>>>>>> - 3.x is faster
>>>>>>>> - 3.x has improved xml conformance
>>>>>>>>
>>>>>>>> I stepped through the test case a little and wasn't able to see what
>>>>>>>> was going on right away. I would need to get the AXIOM sources to really
>>>>>>>> dig in more - I suspect the bug might lie in there after a little bit
>>>>>>>> of digging, but that is because that's the place I haven't looked yet.
>>>>>>>>
>>>>>>>> Any chance you could catch the message being sent from the server with
>>>>>>>> something like TCPMon and post it to the JIRA issue?
>>>>>>>>
>>>>>>>> - Dan
>>>>>>>>
>>>>>>>> Chris Berry wrote:
>>>>>>>> That fixes it!!!
>>>>>>>>
>>>>>>>> I modified all of the pertinent POMs accordingly;
>>>>>>>> I.e.
>>>>>>>>
>>>>>>>> <dependency>
>>>>>>>>   <groupId>woodstox</groupId>
>>>>>>>>   <artifactId>wstx-asl</artifactId>
>>>>>>>>   <version>2.0.5</version>
>>>>>>>>   <scope>runtime</scope>
>>>>>>>> </dependency>
>>>>>>>>
>>>>>>>> 9 POMs were affected:
>>>>>>>>
>>>>>>>> dogstar:~/java/abdera/svn-head-using-old-woostox/trunk cberry$ find . -name "*.xml" | xargs grep woodstox
>>>>>>>> ./extensions/gdata/pom.xml: <groupId>org.codehaus.woodstox</groupId>
>>>>>>>> ./extensions/geo/pom.xml: <groupId>org.codehaus.woodstox</groupId>
>>>>>>>> ./extensions/json/pom.xml: <groupId>org.codehaus.woodstox</groupId>
>>>>>>>> ./extensions/main/pom.xml: <groupId>org.codehaus.woodstox</groupId>
>>>>>>>> ./extensions/media/pom.xml: <groupId>org.codehaus.woodstox</groupId>
>>>>>>>> ./extensions/opensearch/pom.xml: <groupId>org.codehaus.woodstox</groupId>
>>>>>>>> ./extensions/sharing/pom.xml: <groupId>org.codehaus.woodstox</groupId>
>>>>>>>> ./parser/pom.xml: <groupId>org.codehaus.woodstox</groupId>
>>>>>>>> ./pom.xml: <groupId>org.codehaus.woodstox</groupId>
>>>>>>>>
>>>>>>>> I will add this info to the JIRA.
>>>>>>>>
>>>>>>>> James,
>>>>>>>> Can we move the SVN Head back to 2.0.5 until this is resolved??
>>>>>>>>
>>>>>>>> FYI: we are using woodstox 3.2.1 in another project with these exact
>>>>>>>> same XMLs without problem??
>>>>>>>>
>>>>>>>> Thanks much,
>>>>>>>> -- Chris
>>>>>>>> On Sep 4, 2007, at 10:04 AM, Chris Berry wrote:
>>>>>>>>
>>>>>>>> I will try that. I didn't before, because I wasn't sure that it
>>>>>>>> wasn't required somehow internally...
>>>>>>>>
>>>>>>>> BTW: I ran these XML documents with the supposed invalid chars thru 2
>>>>>>>> different UTF-8 conversions as I read them from disk, before putting
>>>>>>>> them into the
>>>>>>>> And I also processed them with the Unix "iconv" utility
>>>>>>>> So I am pretty darn sure that there are no invalid chars in there.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> -- Chris
>>>>>>>> On Sep 4, 2007, at 9:26 AM, James M Snell wrote:
>>>>>>>>
>>>>>>>> Well, FWIW, there are no changes in Abdera 0.3.0 that *require* the new
>>>>>>>> version of woodstox. If dropping down to an older version addresses the
>>>>>>>> issue, then we can explore that as a solution.
>>>>>>>>
>>>>>>>> - James
>>>>>>>>
>>>>>>>> Chris Berry wrote:
>>>>>>>> Hmmm.
>>>>>>>> FYI: I saw a similar problem with an earlier 0.3. I was mixing the
>>>>>>>> latest woodstox with Abdera
>>>>>>>> Or more correctly, maven was bringing in some chained dependencies --
>>>>>>>> one of which brought in woodstox 3.2.1.
>>>>>>>> Abdera was using woodstox 2.0.5 at that time.
>>>>>>>> The problem went away when I corrected this problem....
>>>>>>>>
>>>>>>>> Note, if this is your problem, you can work around it with the maven
>>>>>>>> <exclusions> element
>>>>>>>> e.g.
>>>>>>>> <dependency>
>>>>>>>>   <groupId>com.whatever</groupId>
>>>>>>>>   <artifactId>foo</artifactId>
>>>>>>>>   <version>1.2.3</version>
>>>>>>>>   <exclusions>
>>>>>>>>     <exclusion>
>>>>>>>>       <groupId>org.codehaus.woodstox</groupId>
>>>>>>>>       <artifactId>wstx-lgpl</artifactId>
>>>>>>>>     </exclusion>
>>>>>>>>   </exclusions>
>>>>>>>> </dependency>
>>>>>>>>
>>>>>>>> BTW: this is why I suspect that the Abdera 0.3 UTF-8 issue is related to
>>>>>>>> the woodstox upgrade....
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> -- Chris
>>>>>>>>
>>>>>>>> On Sep 4, 2007, at 8:59 AM, Iops@gmx.de wrote:
>>>>>>>>
>>>>>>>> Hi Chris!
>>>>>>>>
>>>>>>>> Thanks for your feedback!
>>>>>>>>
>>>>>>>> This is exactly the bug I am seeing.
>>>>>>>> AFAICT, it is not related to a missing <?xml version="1.0" encoding="UTF-8"?>,
>>>>>>>> Incidentally, my code worked fine before a recent "svn up" and it has
>>>>>>>> no XML declaration,
>>>>>>>>
>>>>>>>> If I understand your problem correctly, it occurs if you parse an
>>>>>>>> entry with an AbderaClient (i.e. calling "entry.getContent()"), right?
>>>>>>>>
>>>>>>>> Mine occurs if I use an AbderaClient to create an entry on an
>>>>>>>> external server, which is btw a proprietary closed-source thingy. The
>>>>>>>> server then gives me the error message while it tries to parse my
>>>>>>>> request.
>>>>>>>>
>>>>>>>> It seems that knowing that another person is seeing the issue
>>>>>>>> confirms that the issue is on Abdera's side...
>>>>>>>>
>>>>>>>> I'm not sure if we both encounter the same problem. My problem occurs
>>>>>>>> also with the AbderaClient 0.22. Yours occurred only after updating to
>>>>>>>> 0.30-snapshot, right?
>>>>>>>>
>>>>>>>> I haven't the slightest idea whether the problem lies in my code, in
>>>>>>>> the abdera-code or even in the server-code.
>>>>>>>>
>>>>>>>> My next test would be the creation of an atom-entry by hand without
>>>>>>>> the AbderaClient and provide an "<?xml version="1.0" encoding="UTF-8"?>"
>>>>>>>> to check how the server reacts.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Herbert
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> GMX FreeMail: 1 GB mailbox, 5 e-mail addresses, 10 free SMS.
>>>>>>>> All info and free sign-up: http://www.gmx.net/de/go/freemail
>>>>>>>>
>>>>>>>> S'all good --- chriswberry at gmail dot com
>>>>>>>>
>>>>>>>> S'all good --- chriswberry at gmail dot com
>>>>>>>>
>>>>>>>> S'all good --- chriswberry at gmail dot com
>>>>>>>>
>>>>>>>> --
>>>>>>>> Dan Diephouse
>>>>>>>> MuleSource
>>>>>>>> http://mulesource.com | http://netzooid.com/blog
>>>>>>>>
>>>>>>>> S'all good --- chriswberry at gmail dot com
>>>>>>>>
>>>>>>>
>>>>>>> S'all good --- chriswberry at gmail dot com
>>>>>>>
>>>>>
>>>>> S'all good --- chriswberry at gmail dot com
>>>>>
>>>
>>> S'all good --- chriswberry at gmail dot com
>>>
>>
>> --
>> Stephen Duncan Jr
>> www.stephenduncanjr.com
>
> S'all good --- chriswberry at gmail dot com

S'all good --- chriswberry at gmail dot com
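
The thread mentions running the XML files through UTF-8 conversions and iconv to rule out bad bytes. For anyone who wants to repeat that check from Java, here is a minimal sketch of a strict UTF-8 validation. It assumes Java 7 or later; the class name Utf8Check and the method isValidUtf8 are hypothetical and are not part of Abdera, Axiom, or Woodstox. It only tells you whether the bytes are well-formed UTF-8, not where in the parsing stack the reported error comes from.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of Abdera, Axiom, or Woodstox.
public class Utf8Check {

    /** Returns true only if the bytes decode as UTF-8 with no malformed sequences. */
    public static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of silently substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Usage: java Utf8Check path/to/file.xml
    public static void main(String[] args) throws Exception {
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(isValidUtf8(bytes) ? "valid UTF-8" : "malformed UTF-8");
    }
}
```

If a file passes this check but the parser still reports an invalid byte, that would point toward how the stream is being read or re-encoded on the way to the parser, rather than toward the file contents themselves.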