cocoon-dev mailing list archives

From "Victor Smirnov" <>
Subject Re: Remarks on i18n
Date Thu, 03 Feb 2000 12:59:58 GMT

>Victor Smirnov wrote:
>> Hello!
>> My first message was just to point out the problem.
>> Here I try to put forward a suggestion for improvement.
>> Let's look at the tag
>> <?cocoon-format type="xxx/yyy"?>
>> We can also include the charset. This would be
>> <?cocoon-format type="xxx/yyy" charset="zzz"?>
>> The default charset can be set in the config file:
>> formater.charset = Cp1251
>Hmm. Do you mean that the charset in the document is the charset of the
>document, and the charset for the formatter is the output charset? Or is
>the output always UTF-8? UTF-8 output is not suitable for me.
Yes, but the default is ISO-8859-5.

Let me explain my thoughts a bit more. The output that is sent to the
client is a byte array, while a String in Java is an array of Unicode
characters. In the end, the formatter produces a String, which can be
converted to bytes in different ways.
For instance
PrintWriter out = res.getWriter();
Then the result is converted to Cp1251. We can put
and the string will be converted to KOI.
By default it is converted to ISO-8859-1, and symbols that can't be
converted are replaced with '?'.
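
To illustrate the point about one String giving different byte arrays, here
is a minimal standalone sketch. The class name CharsetDemo and the helper
encode() are made up for this example, and it assumes the JVM ships the
windows-1251 (Cp1251) and KOI8-R charsets, as standard JDK builds do. It uses
String.getBytes(Charset), which substitutes the charset's replacement byte
('?' for ISO-8859-1) for characters that cannot be mapped.

```java
import java.nio.charset.Charset;

class CharsetDemo {
    // Hypothetical helper: encode a Unicode string with the named charset.
    // String.getBytes(Charset) replaces unmappable characters with the
    // charset's default replacement byte ('?' in the case of ISO-8859-1).
    static byte[] encode(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // The Russian word "Privet" (hello), written with Unicode escapes so
        // that the source file encoding does not matter.
        String text = "\u041f\u0440\u0438\u0432\u0435\u0442";
        for (String name : new String[] {"windows-1251", "KOI8-R", "ISO-8859-1"}) {
            StringBuilder hex = new StringBuilder();
            for (byte b : encode(text, name)) {
                hex.append(String.format("%02X ", b & 0xFF));
            }
            System.out.println(name + ": " + hex.toString().trim());
        }
    }
}
```

The same six characters come out as different bytes under Cp1251 and KOI8-R,
and as six '?' bytes under ISO-8859-1.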

My proposal is to let the user specify the charset used to send the result
to the client. It can be set globally for the whole site and all formatters,
or individually in the files.
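
A rough sketch of how such a lookup could work, not actual Cocoon code: the
class OutputCharsetResolver and its resolve() method are hypothetical, the
property key is copied from the proposal quoted above, and the ISO-8859-1
fallback simply mirrors the conversion default mentioned earlier. A charset
given in the file's processing instruction wins over the site-wide setting.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OutputCharsetResolver {
    // Matches charset="..." inside the data part of a processing instruction,
    // e.g. type="xxx/yyy" charset="zzz".
    private static final Pattern CHARSET_ATTR =
            Pattern.compile("charset\\s*=\\s*\"([^\"]+)\"");

    // Hypothetical lookup order: per-file value, then site-wide property,
    // then ISO-8859-1.
    static String resolve(String piData, Properties config) {
        if (piData != null) {
            Matcher m = CHARSET_ATTR.matcher(piData);
            if (m.find()) {
                return m.group(1);
            }
        }
        // Key name as spelled in the proposal; an assumption, not a real setting.
        return config.getProperty("formater.charset", "ISO-8859-1");
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("formater.charset", "Cp1251");
        System.out.println(resolve("type=\"text/html\" charset=\"KOI8-R\"", config));
        System.out.println(resolve("type=\"text/html\"", config));
    }
}
```

In a servlet, the resolved name would normally be passed in the content type
(res.setContentType(type + "; charset=" + name)) before res.getWriter() is
called, since the writer's encoding is fixed once it has been obtained.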

The other two questions are how to set up producers and processors.

I suppose that FileProducer gets the encoding from the XML declaration
(<?xml version="1.0" encoding="ISO-8859-5"?>)
and/or from the file.encoding property
(java -Dfile.encoding=ISO-8859-5)
At the moment I don't know which.
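
To make the two candidates concrete, here is a sketch of what "XML
declaration first, JVM default second" could look like. This is only an
illustration, not how FileProducer is known to behave: the class
SourceEncodingGuess and its guess() method are invented, and the approach
only works for ASCII-compatible encodings. A real XML parser does its own,
more thorough detection.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SourceEncodingGuess {
    // Matches the encoding pseudo-attribute of a leading XML declaration.
    private static final Pattern ENCODING = Pattern.compile(
            "^<\\?xml[^>]*encoding\\s*=\\s*[\"']([^\"']+)[\"']");

    // Hypothetical helper: inspect the first bytes of a file and pick a charset.
    static Charset guess(byte[] head) {
        // The declaration itself is plain ASCII in ASCII-compatible encodings.
        int len = Math.min(head.length, 200);
        String prefix = new String(head, 0, len, StandardCharsets.US_ASCII);
        Matcher m = ENCODING.matcher(prefix);
        if (m.find() && Charset.isSupported(m.group(1))) {
            return Charset.forName(m.group(1));
        }
        // Fall back to the JVM default, which has traditionally followed
        // the file.encoding system property.
        return Charset.defaultCharset();
    }

    public static void main(String[] args) {
        byte[] doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-5\"?><page/>"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println(guess(doc));
        System.out.println(guess("<page/>".getBytes(StandardCharsets.US_ASCII)));
    }
}
```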

Processors are the other question. Let's take SQLProcessor. It retrieves
data. OK, what is the data encoding? As for me, I don't want my Oracle
driver to convert Russian text into &...;... How can I set this?

- Victor
