incubator-any23-dev mailing list archives

From "Lewis John McGibbney (Commented) (JIRA)" <>
Subject [jira] [Commented] (ANY23-12) character are wrongly encoded in rdfxml output
Date Wed, 12 Oct 2011 19:15:11 GMT


Lewis John McGibbney commented on ANY23-12:

Brief exploration:

1. The attached file is indeed utf-8 encoded and correctly marked as such in the header

2. On the command line, parsing and re-serializing it with "any23 -f rdfxml" produces a correctly
utf-8 encoded file, no encoding problems

3. I uploaded a copy of the file here:

4. Parsing and re-serializing this uploaded file with produces a correctly utf-8
encoded response, no encoding problems:

5. Copy-pasting the file's contents into the textarea on produces a broken double
utf-8 encoded response, as indicated by the reporter

So the problem seems to be related to the processing of a submitted textarea.

Hypothesis, without having looked at the any23 servlet's code: the textarea's content is correctly
submitted and sent over the wire as utf-8, but the servlet messes up the encoding before sending
it to the any23 parser.
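
The hypothesis can be illustrated outside the servlet. The following is a minimal sketch, not code from any23: the class and method names are hypothetical, and the assumption is that the bytes arrive as UTF-8 but are decoded as ISO-8859-1 before being written out again as UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any23.
class DoubleEncodingDemo {
    // Simulates the suspected bug: UTF-8 bytes from the wire are
    // decoded as if they were ISO-8859-1.
    static String misdecode(String original) {
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String label = "caf\u00e9"; // "cafe" with e-acute
        String broken = misdecode(label);
        // Serializing the wrongly decoded string as UTF-8 again gives
        // the "double encoded" output: each 2-byte character becomes 4 bytes.
        int before = label.getBytes(StandardCharsets.UTF_8).length;
        int after = broken.getBytes(StandardCharsets.UTF_8).length;
        System.out.println(before + " -> " + after); // prints "5 -> 7"
    }
}
```

If the servlet output shows this pattern (every non-ASCII character replaced by two Latin-1 characters), that would be consistent with the hypothesis, although it does not prove where in the servlet the wrong decoding happens.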

This seems relevant:

It states that by default, POST bodies are assumed to be ISO-8859-1. This can be overridden
by specifying a charset in the Content-Type header of the HTTP request, but most browsers
don't do that when submitting form posts, so relying on it doesn't appear to be an option.
The solution proposed there is to put a filter in front of the servlet that sets the request
encoding explicitly. Apparently, ready-made code for doing that could be lifted from Tomcat.
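
The effect of that default can be sketched with the JDK alone. This is an illustration under assumptions, not the servlet's actual code: FormBodyCharsetDemo and decodeValue are hypothetical names, and the fallback to ISO-8859-1 mimics the container behaviour described above. In a real filter, the corresponding fix would be to call ServletRequest.setCharacterEncoding("UTF-8") before any request parameter is read.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class, not part of any23 or Tomcat.
class FormBodyCharsetDemo {
    // Decodes one percent-encoded form value. When the request declares
    // no charset, fall back to ISO-8859-1, as a container would by default.
    static String decodeValue(String encoded, String declaredCharset) {
        String charset = (declaredCharset != null) ? declaredCharset : "ISO-8859-1";
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String value = "caf%C3%A9"; // UTF-8 bytes of "cafe" with e-acute
        // No declared charset: the two bytes C3 A9 become two characters.
        System.out.println(decodeValue(value, null).length());    // prints 5
        // Charset forced to UTF-8: the two bytes become one character.
        System.out.println(decodeValue(value, "UTF-8").length()); // prints 4
    }
}
```

Forcing the charset on the server side avoids depending on whether the browser sends one.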
> character are wrongly encoded in rdfxml output 
> -----------------------------------------------
>                 Key: ANY23-12
>                 URL:
>             Project: Apache Any23
>          Issue Type: Bug
>            Reporter: Lewis John McGibbney
>         Attachments: Soldering_iron_test.rdf
> What steps will reproduce the problem?
> 1. open the file Soldering_iron_test.rdf in your browser and check that all characters are
> displayed correctly, especially the rdfs:label values in different languages
> 2. go to
> 3. copy the file content into content form 
> 4. set output to rdfxml 
> What is the expected output? What do you see instead?
> In the rdfxml output, the rdfs:label values are wrongly encoded
> What version of the product are you using?
> Please provide any additional information below.


