cocoon-users mailing list archives

From "Piroumian, Konstantin" <KPiroum...@flagship.ru>
Subject Re: Language Support in Request Parameters
Date Fri, 28 Sep 2001 09:32:32 GMT
 Hi!
>
> You said you had the same problem in Oracle. I'm using MySQL with the MM
> JDBC driver, which expects the normal chars encoding (%xx). Even if I use
> the serializer encoding (which I am about to try now), it will not do the
> trick, it will only try to overcome an inherent problem (if it will work
> at all):
>
> The serializer is the output's last pipe's stage, and it seems that the
> problem is somewhere in the input pipe (meaning from the user to the
> server and within the server). I can even see that the Log file has
> incorrect data.
>
> An interesting point is that when I retrieve data from the database into a
> session argument (using the DBAuthAction), the data is inserted correctly
> - meaning that I see it proper in the Log file and in the resulting page
> (both in plain hebrew). As I looked into the code I saw that the Action
> simply queries the db, and puts the result in the session param (am I
> right, or is there any encoding modification here?). Thus, the results are
> good, and I see my text as intended.

So, the problem is not in the DB and JDBC driver, is it?

>
> However, if the data is provided from the user, then the data gets
> corrupted (reencoded?? where in c2?) - It is the same corrupted data for
> the database, the resulting html and the log file. So, I guess that the
> problem is in the translation of the data. Since Tomcat does not translate
> the data, then (as you said) the problem is with the C2 translation.

Do you use actions to process the user input? Did you try to use
java.net.URLDecoder.decode() before using params?
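
A minimal sketch of that decoding step. It assumes a JDK that has the
two-argument URLDecoder.decode(String, String), which lets you name the
charset explicitly (it was added in Java 1.4; the one-argument form uses the
platform default). ParamDecoder and decodeParam are hypothetical names, and
the sample value is UTF-8 encoded only to keep the example portable; for a
Hebrew form the charset could be ISO-8859-8 or windows-1255, depending on
the encoding of the page that holds the form.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class ParamDecoder {
    // Decodes a percent-encoded form value with an explicitly named charset.
    // Hypothetical helper: the charset must match the one the browser used.
    static String decodeParam(String raw, String charsetName) {
        try {
            return URLDecoder.decode(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // Four Hebrew letters (shin, lamed, vav, final mem), percent-encoded as UTF-8.
        String raw = "%D7%A9%D7%9C%D7%95%D7%9D";
        String decoded = decodeParam(raw, "UTF-8");
        System.out.println(decoded.length());                            // 4
        System.out.println(decoded.equals("\u05E9\u05DC\u05D5\u05DD")); // true
    }
}
```

If the charset name does not match what the browser actually sent, the call
still succeeds but returns different characters, so it is worth logging the
result before passing it on to the DB.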

>
> 1. Where does this translation take place? (which file in the source)

That depends on your pipeline. Maybe there is no translation at all. What
does your pipeline look like?

> 1.1 Why do we have this translation?

As far as I remember, this happens because some servers treat HTTP requests
as 8-bit encoded data and do not interpret Unicode streams correctly. So
browsers URL-encode every character above ASCII code 127 into the %XX form.
The same thing happens when the web server sends the response. Something
like that, but I'm not sure that all of this is correct. See the Tomcat
documentation and the Servlet specification for more info.
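
To illustrate the %XX form, here is a small sketch using
java.net.URLEncoder. It is not taken from C2 or Tomcat;
PercentEncodingDemo and encode are hypothetical names. It shows that the
same character yields different %XX sequences depending on the charset
used to turn it into bytes, which is why both sides have to agree on it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class PercentEncodingDemo {
    // Percent-encodes text after converting it to bytes in the named charset.
    static String encode(String text, String charsetName) {
        try {
            return URLEncoder.encode(text, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String eAcute = "\u00E9"; // LATIN SMALL LETTER E WITH ACUTE
        System.out.println(encode(eAcute, "ISO-8859-1")); // %E9
        System.out.println(encode(eAcute, "UTF-8"));      // %C3%A9
        System.out.println(encode("abc", "UTF-8"));       // abc (ASCII letters pass through)
    }
}
```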

> 1.2 C2 pipe model cannot use encoding other than UTF8 ?

I think that it's possible, because both Xerces and Xalan are able to
process documents in different encodings. But I've never tried it, so I
can't help you on this point.

> 1.3 If so, how can I handle data that came FROM the database, and put it
> into the session argument, the html and the log file??

As you said above, you don't have problems with it now. Or did I get you
wrong?

> 2. How did you solve your problem with Oracle? How did you insert proper
> data to the database? (actually, the problem is not with the oracle or the
> driver but in C2 - I guess that this should be posted as a bug, no?)

The problem was with C1 (not C2), and after we changed the Oracle DB
encoding to UTF-8 everything worked as expected. But that was about a
year ago...

> 3. Is there a way I can bypass the translation and give it directly to the
> DB action? This way, I won't need to make major changes in C2, and
> everybody will be happy... :-)

I think that you should try to find out where the data is changed. Anyway,
try to decode the parameter.
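
One way to find the stage where the data changes is to log the parameter as
numeric code units at each step, so the log itself cannot garble it. The
sketch below is only an illustration: EncodingProbe, codeUnits and redecode
are hypothetical names, and redecode rests on the assumption that the value
was decoded byte-for-byte as ISO-8859-1 somewhere upstream. If that
assumption does not hold for your setup, redecode will not repair anything.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingProbe {
    // Renders each UTF-16 code unit as U+XXXX, so the log line is plain ASCII.
    static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // Assumption: the string holds bytes that were read as ISO-8859-1.
    // Recover the original bytes and decode them with the actual charset.
    static String redecode(String misdecoded, Charset actual) {
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, actual);
    }

    public static void main(String[] args) {
        // UTF-8 bytes D7 A9 D7 9C (shin, lamed) as they look when read as ISO-8859-1.
        String garbled = "\u00D7\u00A9\u00D7\u009C";
        System.out.println(codeUnits(garbled));
        // U+00D7 U+00A9 U+00D7 U+009C
        System.out.println(codeUnits(redecode(garbled, StandardCharsets.UTF_8)));
        // U+05E9 U+05DC
    }
}
```

Comparing the codeUnits output right after reading the request parameter,
before the DB action, and before serialization should show which stage
changes the value.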

I don't think that I can provide much help, because I have been away from C2
for a long time and it seems that I have forgotten many details. I'll try to
put together an i18n sample with form data input and simple processing, and
then I'll be able to give you more definite answers.

Konstantin

>
> Thanks,
> Udi.
>
>
> On Thu, 27 Sep 2001, Piroumian, Konstantin wrote:
>
> > Hi!
> > See below...
> >
> > >
> > > C2, WNT (hebrew enabled), Tomcat3.2.3, MySQL, IE5.5
> > >
> > > Hey!
> > >
> > > I'm trying to write an application that uses hebrew in forms (meaning
> > > that the user can insert hebrew chars into form elements, mainly input
> > > boxes). I guess that the problem is the same in any language which is
> > > encoded into special html chars.
> > >
> > > I ran a simple application in tomcat (as a simple servlet) and in
> > > cocoon2, which simply takes the data you entered in an input box (in
> > > hebrew), place it into a request parameter and then displays the
> > > request parameter from a different page.
> > >
> > > In the post message, I saw that explorer is coding my chars correct:
> > > POST .....
> > > host: ...
> > >
> > > UserName: %E0%D9%E3....
> > >
> > > When I ran the application on a Tomcat servlet - the results were
> > > good. I saw the exact (hebrew) chars that I've written before.
> > >
> > > On the C2, however, the parameter did not show up correctly, and was
> > > coded differently.
> >
> > Maybe you should try to configure your serializer to use the correct
> > encoding?
> > <map:serialize>
> >   <encoding>[HEBREW_ENCODING_NAME]</encoding>
> > </map:serialize>
> >
> > >
> > > The problem is greater when I try to insert data into MySQL db (which
> > > expects the normal %XX encoding) and get garbage there as well.
> >
> > Is it the same garbage that you see on the screen? We had similar
> > problems with JDBC drivers and Oracle.
> >
> > >
> > >
> > > Did anyone use C2 in an html-encoded language? Can you tell me what I
> > > need to do to make it work?
> > > Why is Tomcat working and C2 not? Where is the translation being
> > > performed?
> >
> > I think that this happens because C2 uses Unicode (UTF-8) encoding for
> > all internal transformations, while Tomcat operates on bytes and does
> > not perform the extra encodings needed in C2.
> >
> > >
> > > Thanks,
> > > Udi.
> > >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > Please check that your question has not already been answered in the
> > > FAQ before posting. <http://xml.apache.org/cocoon/faqs.html>
> > >
> > > To unsubscribe, e-mail: <cocoon-users-unsubscribe@xml.apache.org>
> > > For additional commands, e-mail: <cocoon-users-help@xml.apache.org>
> > >
