Message-ID: <49CADF10.8000007@gmail.com>
Date: Wed, 25 Mar 2009 18:49:04 -0700
From: James M Snell
To: dev@abdera.apache.org
CC: abdera-dev@incubator.apache.org
Subject: Re: [jira] Created: (ABDERA-222) Parse failures reading utf-8 xml files that have attribute values that contain non US-ASCII valid utf-8 characters
References: <1826815061.1237858490713.JavaMail.jira@brutus>
In-Reply-To: <1826815061.1237858490713.JavaMail.jira@brutus>

Interestingly, we spotted a similar problem running Abdera on WebSphere 6.1.0.17 and higher. The problem was fixed by applying a fixpack. I'm not sure if this is an abdera bug or something we need to code defensively for.

- James

jv ning (JIRA) wrote:
> Parse failures reading utf-8 xml files that have attribute values that contain non US-ASCII valid utf-8 characters
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: ABDERA-222
>                 URL: https://issues.apache.org/jira/browse/ABDERA-222
>             Project: Abdera
>          Issue Type: Bug
>    Affects Versions: 0.4.0
>         Environment: Solaris x86_64, Mac OS X Leopard x86_64, Linux x86_64
>            Reporter: jv ning
>
>
> When parsing XML files that are items fetched by http-client 3.1
>
> The same items parse correctly if they are written to a byte array and a ByteArrayInputStream on the byte array is passed to parse.
> parser.parse(response.getResponseBodyAsStream());
>
> Caused by: com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character (NULL, unicode 0) encountered: not valid in any content
>  at [row,col {unknown-source}]: [3,56]
>         at com.ctc.wstx.sr.StreamScanner.constructNullCharException(StreamScanner.java:615)
>         at com.ctc.wstx.sr.StreamScanner.throwInvalidSpace(StreamScanner.java:644)
>         at com.ctc.wstx.sr.BasicStreamReader.readTextPrimary(BasicStreamReader.java:4554)
>         at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2886)
>         at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
>         at org.apache.abdera.parser.stax.FOMBuilder.getNextElementToParse(FOMBuilder.java:163)
>         at org.apache.abdera.parser.stax.FOMBuilder.next(FOMBuilder.java:187)
>
>
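
For reference, the workaround described in the report (copy the response into a byte array, then parse from a ByteArrayInputStream) could look roughly like the sketch below. It uses only the JDK: the class name BufferedParseSketch and the helpers buffer and firstAttribute are made up for illustration, and the JDK's own StAX reader stands in for the Abdera parser. It does not reproduce or explain the failure; it only shows the defensive buffering step.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: names are illustrative, not part of Abdera or HttpClient.
public class BufferedParseSketch {

    /** Drains the source stream into memory and returns a new stream over the copied bytes. */
    static InputStream buffer(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return new ByteArrayInputStream(out.toByteArray());
    }

    /** Returns the first attribute value of the first element that carries attributes, or null. */
    static String firstAttribute(InputStream in) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getAttributeCount() > 0) {
                    return reader.getAttributeValue(0);
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // UTF-8 document with a non US-ASCII character in an attribute value.
        byte[] xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><entry title=\"caf\u00e9\"/>"
                .getBytes(StandardCharsets.UTF_8);

        // Stand-in for the stream returned by the HTTP client.
        InputStream network = new ByteArrayInputStream(xml);

        // Buffer first, then hand the in-memory stream to the parser.
        InputStream buffered = buffer(network);
        System.out.println(firstAttribute(buffered));
    }
}
```

In the reported setup, the stream returned by buffer(...) would presumably be what gets passed to parser.parse(...) in place of response.getResponseBodyAsStream(). The cost is that the whole response is held in memory, so this is only reasonable for feeds of modest size.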