cocoon-dev mailing list archives

From "Andrew Stevens (JIRA)" <j...@apache.org>
Subject [jira] Commented: (COCOON-2002) HTML transformer only works with latin-1 characters
Date Thu, 08 Feb 2007 11:37:05 GMT

    [ https://issues.apache.org/jira/browse/COCOON-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471290 ]

Andrew Stevens commented on COCOON-2002:
----------------------------------------

Just a thought: won't the encoding that needs to be used depend on what was used in the input
document?  For example, if the source document passed in from a file generator declares
<?xml version="1.0" encoding="Big5"?>, would the above change cause similar problems?

Also, how is this affected by the char-encoding property in the tidy.properties configuration
file?  Rather than making the above change, could you have solved your problem by ensuring that
property matches the encoding used in your source documents?  It may be that JTidy's default
is latin-1.

It seems to me that passing the above value to the getBytes call assumes that the
AbstractSAXTransformer's text recording code always uses UTF-8 for the stored text (and
transcodes where necessary).  Is this actually the case?
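
To make the concern about the getBytes call concrete, here is a minimal Java sketch. It is not
Cocoon or JTidy code: the class name EncodingDemo and the helper roundTrip are hypothetical and
only illustrate what happens when the charset used to encode a string differs from the one used
to decode it, or cannot represent the characters at all.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Cocoon or JTidy.
public class EncodingDemo {

    /** Encodes text to bytes with one charset, then decodes those bytes with another. */
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        // Four Arabic-script letters, none of which exist in latin-1.
        String text = "\u0633\u0644\u0627\u0645";

        // latin-1 cannot represent these characters, so getBytes substitutes '?' for each one.
        String lossy = roundTrip(text, StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1);
        System.out.println(lossy); // prints "????"

        // Encoding and decoding with the same capable charset preserves the text.
        String intact = roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(intact.equals(text)); // prints "true"

        // Encoding as UTF-8 but decoding as latin-1 yields garbled text, not the original.
        String garbled = roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.equals(text)); // prints "false"
    }
}
```

The first case matches the "page of question marks" symptom in the report.  The third case is the
risk raised above: hard-coding one charset in getBytes only helps if every other stage reads and
writes the text with that same charset.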


> HTML transformer  only works with latin-1 characters
> ----------------------------------------------------
>
>                 Key: COCOON-2002
>                 URL: https://issues.apache.org/jira/browse/COCOON-2002
>             Project: Cocoon
>          Issue Type: Bug
>          Components: Blocks: HTML
>    Affects Versions: 2.1.10, 2.1.11-dev (Current SVN)
>            Reporter: Abbas Mousavi
>            Priority: Critical
>
> When transforming HTML in encodings other than latin-1,
> the result is a page of question marks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

