So it turned out updating Xalan fixed the problem completely.
We went with Xalan 2.7.1 (which has Xerces 2.9.0 included).
We replaced 'xercesImpl.jar' and 'xml-apis.jar' in Tomcat's endorsed folder. In our lib folder we swapped 'xalan-2.6.1-dev-20041008T0304.jar' for 'xalan.jar' from 2.7.1 and added 'serializer.jar' alongside it.
We restarted Tomcat, the problem went away, and nothing else on the site was affected. In fact, it seems a little faster now. :)
So now we're running fine on CentOS 5, JDK 1.6.21 and Tomcat 5.0.28.
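
For anyone who hits the same thing and wants to confirm which XSLT implementation the JVM actually picks up after swapping jars, here is a rough sketch of the kind of check that could help. It is not what we ran; the class name XsltImplCheck and the helper locationOf are made up for illustration. It only uses standard JAXP and System.getProperty calls, and would need to run with the same classpath/endorsed setup as the webapp to be meaningful.

```java
import java.security.CodeSource;
import javax.xml.transform.TransformerFactory;

// Hypothetical diagnostic, not part of Cocoon, Xalan or Tomcat.
final class XsltImplCheck {

    // Returns the jar or directory a class was loaded from.
    // Classes loaded by the bootstrap loader have no code source.
    static String locationOf(Class<?> c) {
        CodeSource src = c.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "bootstrap (JDK built-in)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) {
        // JAXP lookup: picks whichever TransformerFactory is visible first.
        TransformerFactory tf = TransformerFactory.newInstance();

        System.out.println("java.version       = " + System.getProperty("java.version"));
        System.out.println("file.encoding      = " + System.getProperty("file.encoding"));
        System.out.println("TransformerFactory = " + tf.getClass().getName());
        System.out.println("loaded from        = " + locationOf(tf.getClass()));
    }
}
```

If the last line points at the JDK rather than at xalan.jar, the jar you dropped in is probably not the one being used on that classpath.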
> Date: Wed, 29 Sep 2010 09:41:55 -0400
> From: email@example.com
> To: firstname.lastname@example.org
> Subject: Re: System upgrade and now Cocoon is escaping tabs/entities.
>
> J,
>
> On 9/29/2010 1:10 AM, . . wrote:
> >> &#a9 should be a copyright symbol if you're using ASCII.
> >>
> >> I suspect that &#a9 is being used instead of a newline (0xa) followed by
> >> a tab (0x9).
> >
> > Actually it was a typo on my part. It's using 	 :( *oops*
>
> Yeah, that makes a ton of difference. I'm glad it wasn't 0xa9, 'cause
> that would have been a real mess. :)
>
> >> [file.encoding] is likely to solve both of your problems.
> >
> > I wrote a little JSP page to spit out the
> > System.getProperty("file.encoding") value and got some surprising
> > results. I tried two of the existing machines and got ISO-8859-1 for one
> > and ANSI_X3.4-1968 for the other.
>
> ANSI_X3.4-1968, as you probably found out, is essentially basic ASCII,
> and ISO-8859-1 is ASCII plus a few other things, so they are compatible.
> It's not surprising that these two character sets are both working: if
> one works, the other has a good chance of working.
>
> > The application runs fine on both of them. On the new server that too
> > is giving out ISO-8859-1.
>
> Interesting.
>
> > That said, we did an experiment last night and copied the entire
> > previous Tomcat folder over to the new CentOS server and ran it with Sun
> > JDK 1.4.29 - the problem disappeared. When we ran it with JDK 1.5 or 1.6
> > the problem manifested itself.
> >
> > So the problem appears to be related to the JDK in some way. Googling I
> > came up with this:
> >
> > http://stackoverflow.com/questions/1059854/how-do-you-prevent-a-javax-transformer-from-escaping-whitespace
> >
> > Which makes me wonder if the old Xalan from our previous Tomcat is
> > having issues with JDK 1.5 and up. I guess an Xalan upgrade is in order.
>
> Cocoon packages its own Xalan library, so that shouldn't be the
> problem, although I can't remember when Sun started packaging Xalan with
> Java. At some point, I think they even removed it. What version of Xalan
> are you running? It should be in your webapp's WEB-INF/lib directory. I
> don't think there's been a Xalan update in quite a few years.
>
> Let us know how things turn out.
>
> >> NB: Tomcat 5.0 has been retired and really should be replaced. Upgrading
> >> to Tomcat 6.0 shouldn't be too much trouble.
> >
> > Only issue there is we have to support this legacy application for
> > another 12 months and it's a "hand me down" so we have little or no
> > source code or documentation. Porting it now would take up more
> > time/effort than is financially viable right now :(
>
> Technically speaking, servlet containers are supposed to be backward
> compatible. I wouldn't be surprised if, given a review of your <Context>
> element for Tomcat (it should go into META-INF/context.xml, now in your
> webapp, instead of in conf/server.xml for the server), everything else
> works exactly as it did before.
>
> -chris