cocoon-users mailing list archives

From "Bazeley, John" <jbaze...@bloomberg.com>
Subject RE: Stream Generator / uploading UTF-8 encoded chinese files
Date Mon, 11 Jul 2005 08:58:59 GMT
Hi all,

The corruption appears to be a consequence of bug 25594.

I have a workaround involving a hack to StreamGenerator. 
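
For the record, the bytes 357 277 275 that 'od' showed (quoted below) are the UTF-8 encoding of U+FFFD, the Unicode replacement character. That pattern usually means the bytes were decoded with the wrong charset somewhere and then written back out as UTF-8. The following is only a minimal sketch of how such bytes can arise, not a description of what bug 25594 or StreamGenerator actually does. The class and method names are made up for illustration, US-ASCII is an assumed stand-in for whatever wrong charset was in play, and it uses java.nio.charset.StandardCharsets, so it needs a newer Java than the 1.4.2 mentioned in this thread.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Cocoon.
public class ReplacementCharDemo {

    // Decode UTF-8 bytes with the wrong charset (US-ASCII is an assumption
    // for this sketch), then re-encode the resulting string as UTF-8.
    static byte[] roundTripThroughAscii(byte[] utf8Bytes) {
        // Every byte above 0x7F is invalid in US-ASCII, so the String
        // constructor substitutes U+FFFD for each of them.
        String decoded = new String(utf8Bytes, StandardCharsets.US_ASCII);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    // Format bytes as space-separated three-digit octal values,
    // similar to what 'od' prints for individual bytes.
    static String toOctal(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%03o", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+4E2D is a single Chinese character, three bytes in UTF-8.
        byte[] original = "\u4e2d".getBytes(StandardCharsets.UTF_8);
        System.out.println("original:   " + toOctal(original));
        System.out.println("round trip: " + toOctal(roundTripThroughAscii(original)));
    }
}
```

In this sketch each of the three original bytes is replaced separately, so one Chinese character comes back as three copies of 357 277 275.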

Cheers, John

> -----Original Message-----
> From: JBAZELEY@BLOOMBERG.NET 
> Sent: 08 July 2005 12:12
> To: users@cocoon.apache.org
> Subject: RE: Stream Generator / uploading UTF-8 encoded chinese files
> 
> > Hi,
> > 
> > you can configure the encoding like this:
> > Did you configure the <form-encoding> in web.xml?
> > Did you try using the setCharacterEncoding action (at the start of
> > your pipeline)?
> > 
> > Did you open your document with UltraEdit to see what the encoding is?
> > 
> > 
> > Lionel
> > 
> > 
> > 
> > Bazeley, John wrote:
> > 
> > >Hi all,
> > >
> > >I'm trying to use the stream generator to upload XML files that 
> > >are UTF-8 encoded and contain Chinese characters. The source system
> > >is Windows XP and Cocoon is v2.1.7 running on Solaris 9 / Java
> > >1.4.2. Whether I use my own pipeline with curl uploading the file
> > >or the /samples/stream/process-order pipeline, the results are 
> > >the same: the file is returned to me with all the Chinese 
> > >characters mangled ('od' shows all the Chinese characters have 
> > >been converted to 357 277 275).
> > >
> > >I have inserted debug into the stream generator and the XML 
> > >serialiser, and both think they are using UTF-8 encoding. 
> > >
> > >Why is my document getting corrupted? What am I doing wrong?
> > >
> > >The source document has 'encoding="UTF-8"' in the <?xml ... string,
> > >and IE and Firefox both display it correctly and tell me the encoding
> > >is UTF-8, so I am inclined to believe the document is correctly
> > >encoded.
> > >
> > >All suggestions are welcome.
> > >
> > >Thanks, John
> 
> Some more information for the record that I did not post earlier:
> 
> I'm using the version of Jetty that comes bundled with Cocoon 2.1.7 as
> the servlet container.
> Debugging shows that the uploaded file is saved to disk correctly,
> so the corruption happens some time after that.
> 
> I have updated the servlet jar to 2.3, and that did not make things
> any better.
> 
> My minimal pipeline is:
> 
>     <map:match pattern="john/text">
>       <map:generate type="stream">
>         <map:parameter name="generate-attributes" value="true"/>
>         <map:parameter name="form-name" value="my_xmlfile"/>
>       </map:generate>
>       <map:serialize type="text"/>
>     </map:match>
> 
> and as I stated earlier, the corruption occurs using the sample
> uploader too.
> 
> In my sitemap, I have the text serialiser set to utf-8 thus:
>       <map:serializer logger="sitemap.serializer.text" 
>         mime-type="text/plain" name="text" pool-max="20" 
>         src="org.apache.cocoon.serialization.TextSerializer">
>         <encoding>UTF-8</encoding>
>       </map:serializer>
> 
> Thanks for any help,
> --
> John
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
> For additional commands, e-mail: users-help@cocoon.apache.org
> 
> 
> 
