Message-ID: <404A020E.8010100@reverycodes.com>
Date: Sat, 06 Mar 2004 11:53:34 -0500
From: Vadim Gritsenko
To: dev@cocoon.apache.org
Subject: Re: cvs commit: cocoon-2.1 status.xml
References: <20040306144446.48108.qmail@minotaur.apache.org>
In-Reply-To: <20040306144446.48108.qmail@minotaur.apache.org>

joerg@apache.org wrote:

>      public void characters(char[] ch, int start, int length) {
>
>          if (ch.length > 0 && start >= 0 && length > 1) {
>  -           String text = new String(ch, start, length);
>              if (elementStack.size() > 0) {
>                  IndexHelperField tos = (IndexHelperField) elementStack.peek();
>  -               tos.appendText(text);
>  +               tos.appendText(ch, start, length);
>              }
>  -           bodyText.append(text);
>  +           bodyText.append(' ');
>  +           bodyText.append(ch, start, length);
>          }
>      }
>

What will happen when the text "keyword" is streamed as two characters() events, "key" and "word"? I think it will become "key word", and indexing will break.

IIUC, the idea was to add a space in between tags, i.e. so

some

text

is not indexed as "sometext". If that's correct, then a better fix would be to add a space only if a boolean flag, had_start_or_end_element_in_between_char_events, is set.

Vadim
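
The flag-based fix suggested above could look roughly like the sketch below. It is only an illustration, not the actual Cocoon code: the class name SpacingTextCollector, the field sawElementBoundary, and getText() are hypothetical, and the elementStack / IndexHelperField handling from the quoted diff is left out. A space is appended only when a start or end tag was seen since the previous characters() event.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects body text and inserts a separating space
// only when an element boundary occurred between two characters() events.
class SpacingTextCollector extends DefaultHandler {
    private final StringBuilder bodyText = new StringBuilder();
    // Plays the role of had_start_or_end_element_in_between_char_events.
    private boolean sawElementBoundary = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        sawElementBoundary = true;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        sawElementBoundary = true;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (length <= 0) {
            return;
        }
        // Separate text only across tags, never inside one split text run.
        if (sawElementBoundary && bodyText.length() > 0) {
            bodyText.append(' ');
        }
        bodyText.append(ch, start, length);
        sawElementBoundary = false;
    }

    public String getText() {
        return bodyText.toString();
    }

    public static void main(String[] args) {
        SpacingTextCollector c = new SpacingTextCollector();
        Attributes none = new AttributesImpl();
        // One element whose text arrives as two events: "key" + "word".
        c.startElement("", "p", "p", none);
        c.characters("key".toCharArray(), 0, 3);
        c.characters("word".toCharArray(), 0, 4);
        c.endElement("", "p", "p");
        // Two further elements holding "some" and "text".
        c.startElement("", "p", "p", none);
        c.characters("some".toCharArray(), 0, 4);
        c.endElement("", "p", "p");
        c.startElement("", "p", "p", none);
        c.characters("text".toCharArray(), 0, 4);
        c.endElement("", "p", "p");
        System.out.println(c.getText()); // prints "keyword some text"
    }
}
```

With this approach "key" and "word" stay joined as "keyword", while text from separate elements is still kept apart.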