From users-return-98988-apmail-cocoon-users-archive=cocoon.apache.org@cocoon.apache.org Mon Oct 25 08:58:59 2010
Return-Path:
Delivered-To: apmail-cocoon-users-archive@www.apache.org
Received: (qmail 28699 invoked from network); 25 Oct 2010 08:58:59 -0000
Received: from unknown (HELO mail.apache.org) (140.211.11.3) by 140.211.11.9 with SMTP; 25 Oct 2010 08:58:59 -0000
Received: (qmail 98962 invoked by uid 500); 25 Oct 2010 08:58:58 -0000
Delivered-To: apmail-cocoon-users-archive@cocoon.apache.org
Received: (qmail 98823 invoked by uid 500); 25 Oct 2010 08:58:56 -0000
Mailing-List: contact users-help@cocoon.apache.org; run by ezmlm
Precedence: bulk
list-help:
list-unsubscribe:
List-Post:
Reply-To: users@cocoon.apache.org
List-Id:
Delivered-To: mailing list users@cocoon.apache.org
Received: (qmail 98816 invoked by uid 99); 25 Oct 2010 08:58:55 -0000
Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 25 Oct 2010 08:58:55 +0000
X-ASF-Spam-Status: No, hits=2.2 required=10.0 tests=FREEMAIL_FROM,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,SPF_PASS
X-Spam-Check-By: apache.org
Received-SPF: pass (athena.apache.org: domain of svengelska@hotmail.com designates 65.55.34.150 as permitted sender)
Received: from [65.55.34.150] (HELO col0-omc3-s12.col0.hotmail.com) (65.55.34.150) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 25 Oct 2010 08:58:49 +0000
Received: from COL124-W42 ([65.55.34.135]) by col0-omc3-s12.col0.hotmail.com with Microsoft SMTPSVC(6.0.3790.4675); Mon, 25 Oct 2010 01:58:29 -0700
Message-ID:
Content-Type: multipart/alternative; boundary="_ece7f319-8972-4488-a5cb-2318a476b9c8_"
X-Originating-IP: [80.217.185.237]
From: ". ."
To:
Subject: RE: System upgrade and now Cocoon is escaping tabs/entities.
Date: Mon, 25 Oct 2010 08:58:29 +0000
Importance: Normal
In-Reply-To: <4CA34223.4060906@christopherschultz.net>
References: ,<4CA23729.1050008@christopherschultz.net> ,<4CA34223.4060906@christopherschultz.net>
MIME-Version: 1.0
X-OriginalArrivalTime: 25 Oct 2010 08:58:29.0284 (UTC) FILETIME=[CAF33E40:01CB7422]

--_ece7f319-8972-4488-a5cb-2318a476b9c8_
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Chris,

So it turned out updating Xalan fixed the problem completely.

We went with Xalan 2.7.1 (which has Xerces 2.9.0 included).
We replaced 'xercesImpl.jar' and 'xml-apis.jar' in Tomcat's endorsed folder.
In our lib folder we replaced 'xalan-2.6.1-dev-20041008T0304.jar' with
'xalan.jar' from 2.7.1 and added 'serializer.jar'.

Restarted Tomcat and the problem went away, and nothing else on the site was
affected. In fact, it seems a little faster now. :)

So now we're running fine on CentOS 5, JDK 1.6.21 and Tomcat 5.0.28.

- J

> Date: Wed, 29 Sep 2010 09:41:55 -0400
> From: chris@christopherschultz.net
> To: users@cocoon.apache.org
> Subject: Re: System upgrade and now Cocoon is escaping tabs/entities.
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> J,
>
> On 9/29/2010 1:10 AM, . . wrote:
> >> &#a9 should be a copyright symbol if you're using ASCII.
> >>
> >> I suspect that &#a9 is being used instead of a newline (0xa) followed by
> >> a tab (0x9).
> >
> > Actually it was a typo on my part. It's using &#9; :( *oops*
>
> Yeah, that makes a ton of difference. I'm glad it wasn't 0xa9, 'cause
> that would have been a real mess. :)
>
> >> [file.encoding] is likely to solve both of your problems.
> >
> > I wrote a little JSP page to spit out the
> > System.getProperty("file.encoding") value and got some surprising
> > results. I tried two of the existing machines and got ISO-8859-1 for one
> > and ANSI_X3.4-1968 for the other.
>
> ANSI_X3.4-1968, as you probably found out, is essentially basic ASCII,
> and ISO-8859-1 is ASCII plus a few other things, so they are compatible.
> It's not surprising that these two character sets are both working: if
> one works, the other has a good chance of working.
>
> > The application runs fine on both of them. On the new server that too
> > is giving out ISO-8859-1.
>
> Interesting.
>
> > That said, we did an experiment last night and copied the entire
> > previous Tomcat folder over to the new CentOS server and ran it with Sun
> > JDK 1.4.29 - the problem disappeared. When we ran it with JDK 1.5 or 1.6
> > the problem manifested itself.
> >
> > So the problem appears to be related to the JDK in some way. Googling I
> > came up with this:
> >
> > http://stackoverflow.com/questions/1059854/how-do-you-prevent-a-javax-transformer-from-escaping-whitespace
> >
> > Which makes me wonder if the old Xalan from our previous Tomcat is
> > having issues with JDK 1.5 and up. I guess an Xalan upgrade is in order.
>
> Cocoon packages its own Xalan library, so that shouldn't be the
> problem, although I can't remember when Sun started packaging Xalan with
> Java. At some point, I think they even removed it. What version of Xalan
> are you running? It should be in your webapp's WEB-INF/lib directory. I
> don't think there's been a Xalan update in quite a few years.
>
> Let us know how things turn out.
>
> >> NB: Tomcat 5.0 has been retired and really should be replaced. Upgrading
> >> to Tomcat 6.0 shouldn't be too much trouble.
> >
> > Only issue there is we have to support this legacy application for
> > another 12 months and it's a "hand me down" so we have little or no
> > source code or documentation. Porting it now would take up more
> > time/effort than is financially viable right now :(
>
> Technically speaking, servlet containers are supposed to be backward
> compatible.
> I wouldn't be surprised if, given a review of your <Context>
> element for Tomcat (it should go into META-INF/context.xml, now in your
> webapp, instead of in conf/server.xml for the server), everything else
> works exactly as it did before.
>
> - -chris
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.10 (MingW32)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iEYEARECAAYFAkyjQiMACgkQ9CaO5/Lv0PBtOACeKG7EgdIqh+vDNND8wFKAtGHM
> N08AnjBBlR2cvmgIu1BfIDy79bMSAs7Q
> =h7CA
> -----END PGP SIGNATURE-----
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
> For additional commands, e-mail: users-help@cocoon.apache.org
>
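
The fix in this message depends on which Xalan jar the JVM actually picks up (Tomcat's endorsed folder, the webapp's lib folder, or the copy bundled with the JDK). One way to confirm that after a jar swap is to ask JAXP which TransformerFactory it loaded and from where. The following is only a minimal sketch: the class and method names (XsltImplCheck, describeTransformerFactory) are made up for illustration and are not part of Cocoon, Tomcat or Xalan.

```java
import java.security.CodeSource;
import javax.xml.transform.TransformerFactory;

// Hypothetical helper (name is an assumption, not from the thread).
final class XsltImplCheck {

    /** Names the JAXP TransformerFactory implementation in use and where it was loaded from. */
    static String describeTransformerFactory() {
        TransformerFactory factory = TransformerFactory.newInstance();
        Class<?> impl = factory.getClass();
        CodeSource source = impl.getProtectionDomain().getCodeSource();
        // Classes that ship inside the JDK usually report no code source location.
        String location = (source == null || source.getLocation() == null)
                ? "no code source (probably bundled with the JDK)"
                : source.getLocation().toString();
        return impl.getName() + " loaded from " + location;
    }

    public static void main(String[] args) {
        System.out.println(describeTransformerFactory());
    }
}
```

Run inside the webapp (for example from a throwaway JSP), the reported location should point at the new xalan.jar if the replacement took effect. The result depends on the classloader that runs the code, so running it from a plain command line may report a different implementation than Tomcat uses.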
--_ece7f319-8972-4488-a5cb-2318a476b9c8_--
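
The thread compares the file.encoding values reported on different machines (ISO-8859-1 and ANSI_X3.4-1968, which is a registered alias of US-ASCII) and notes that the two are compatible for plain ASCII content. A small sketch of that check follows. The class and method names (EncodingCheck, describeDefaultEncoding, decodesIdentically) are assumptions made for illustration, and StandardCharsets requires Java 7 or later, which is newer than the JDKs discussed in the thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (name is an assumption, not from the thread).
final class EncodingCheck {

    /** Reports the file.encoding system property and the JVM's effective default charset. */
    static String describeDefaultEncoding() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    /** Returns true when the bytes decode to the same text as US-ASCII and as ISO-8859-1. */
    static boolean decodesIdentically(byte[] bytes) {
        String ascii = new String(bytes, StandardCharsets.US_ASCII);
        String latin1 = new String(bytes, StandardCharsets.ISO_8859_1);
        return ascii.equals(latin1);
    }

    public static void main(String[] args) {
        System.out.println(describeDefaultEncoding());
        // A tab (0x09) is plain ASCII, so both charsets agree on it.
        System.out.println(decodesIdentically(new byte[] {'a', 0x09, 'b'}));
        // 0xA9 is the copyright sign in ISO-8859-1 but is not valid US-ASCII.
        System.out.println(decodesIdentically(new byte[] {(byte) 0xA9}));
    }
}
```

Bytes in the 7-bit range, including the tab character at issue here, decode the same way under both charsets, which is consistent with the application working on both of the older machines. Only bytes at 0x80 and above, such as 0xA9, differ.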