From: "Rafael Alvarado"
Reply-To: users@cocoon.apache.org
Subject: RE: My XSP is entifying my UTF
Date: Tue, 9 Mar 2004 17:54:08 -0500
Message-Id: <200403092256.i29Mu7ww011741@smtpserver2.Princeton.EDU>
In-reply-to: <404E493B.4060006@gmx.de>

Here is my situation. I run an etext server with documents written in several languages.
In creating a search interface for a collection of Hebrew documents, for example, I want to pull a distinct list of words from a database and build a set of lists for users to search with. The values have to be in Unicode, since they will be sent back to the database as a query string. I don't want to translate entities back and forth into UTF-8; I would rather work in UTF-8 and forget entities forever.

By the way, I had a similar problem with the HTML generator that uses JTidy. Is this, too, the fault of Xalan?

Rafael C. Alvarado
Manager of Humanities Computing Research Applications
316 87 Prospect | Princeton University

-----Original Message-----
From: Joerg Heinicke [mailto:joerg.heinicke@gmx.de]
Sent: Tuesday, March 09, 2004 5:46 PM
To: users@cocoon.apache.org
Subject: Re: My XSP is entifying my UTF

On 09.03.2004 23:31, Rafael Alvarado wrote:
> OK, thanks for the clarification. So, then, if Xalan is the culprit,
> can it be replaced in Cocoon? My memory says no, but I'll have to take
> a look. If it cannot be replaced, then I'll probably have to drop Cocoon!
> Much to my regret!

Why are the entified characters so problematic?

The good news: Cocoon does not depend on Xalan, but only on a JAXP-compatible processor, so it can be replaced. I know a few people who use Saxon, for example.

The bad news: if you use JDK 1.4, Xalan is delivered with the JDK, and it will be a bit more difficult to keep Cocoon from using it (e.g. by using the ParanoidCocoonServlet).

Joerg

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
For additional commands, e-mail: users-help@cocoon.apache.org
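
For readers of the archive, here is a minimal sketch of the behaviour discussed in this thread, using plain JAXP outside Cocoon. It assumes that the "entifying" comes from the serializer's output encoding: a JAXP serializer has to write characters that the target encoding cannot represent as numeric character references, while a UTF-8 target lets Hebrew text pass through unchanged. The class name `EncodingDemo` and the helper `serialize` are hypothetical names for illustration; this is not Cocoon's own code path.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class EncodingDemo {

    // Hypothetical helper: runs an identity transform and serializes the
    // result to bytes in the requested output encoding.
    static String serialize(String xml, String encoding) {
        try {
            // newInstance() picks whichever JAXP processor is configured
            // (the JDK's built-in one unless another is selected).
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.ENCODING, encoding);
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            identity.transform(new StreamSource(new StringReader(xml)),
                               new StreamResult(out));

            // US-ASCII bytes are also valid UTF-8, so one decoder serves both cases.
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A Hebrew word, written with escapes so the source file stays ASCII.
        String xml = "<w>\u05E9\u05DC\u05D5\u05DD</w>";
        System.out.println(serialize(xml, "UTF-8"));    // raw Hebrew characters
        System.out.println(serialize(xml, "US-ASCII")); // numeric character references
    }
}
```

Standard JAXP also lets a different processor be selected through the `javax.xml.transform.TransformerFactory` system property (Saxon's factory class is `net.sf.saxon.TransformerFactoryImpl`). Whether that is enough inside a given Cocoon and JDK 1.4 setup is the open question raised above.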