From: "Herbert Wu"
Subject: RE: Special characher & ; : % index/search question
Date: Sun, 23 Jul 2006 21:24:09 -0500

WhitespaceAnalyzer looks brutal. Is it possible to keep StandardAnalyzer and at the same time tell it to keep a given list of chars during indexing?

-Herbert

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com]
Sent: Sunday, July 23, 2006 10:56 AM
To: java-user@lucene.apache.org
Subject: Re: Special characher & ; : % index/search question

The WhitespaceAnalyzer breaks up streams on whitespace, and will give you these characters as tokens. Be careful to use it for indexing AND searching. Also, make sure that's the analyzer selected in Luke if you submit queries that way (it's a drop-down on the search page, upper right as I remember).

On 7/22/06, Herbert Wu wrote:
>
> Hi, all,
>
> My document's title field contains standalone (not contained inside a word)
> special chars such as &, :, %, ; etc. With the luke0.6 tool, I found that
> these chars are not indexed in the title field or anywhere else, and hence
> are not searchable. Is there any way to index these special chars for
> search? My env is:
>
> Lucene: version 2.0.0
> Index parser: org.apache.lucene.analysis.standard.StandardAnalyzer
> JDK: Java1.5
> OS: XP sp2
> Debugger: luke0.6
>
> Any help is greatly appreciated!
>
> -Herbert
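
To make the trade-off in this thread concrete, here is a minimal plain-Java sketch of the two tokenizing behaviours being discussed. It deliberately uses no Lucene classes: `TokenizerSketch`, `whitespaceTokens`, `wordTokensKeeping` and the sample title are all hypothetical names made up for illustration, and the second method is only a rough stand-in for "word-style tokenizing plus a list of chars to keep", not how StandardAnalyzer is actually implemented or configured.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: plain-Java stand-ins, NOT Lucene's analyzers.
class TokenizerSketch {

    // Split on whitespace only, so standalone symbols survive as tokens
    // (and punctuation attached to a word stays attached to it).
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String piece : text.trim().split("\\s+")) {
            if (piece.length() > 0) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Emit lowercased runs of letters/digits as tokens. Any other character
    // is dropped unless it appears in keepChars (an assumed "keep list"),
    // in which case it is emitted as a single-character token.
    static List<String> wordTokensKeeping(String text, String keepChars) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                word.append(Character.toLowerCase(c));
                continue;
            }
            // A non-word character ends the current word, if any.
            if (word.length() > 0) {
                tokens.add(word.toString());
                word.setLength(0);
            }
            if (keepChars.indexOf(c) >= 0) {
                tokens.add(String.valueOf(c));
            }
        }
        if (word.length() > 0) {
            tokens.add(word.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String title = "Profit & Loss: 50 % up"; // made-up sample title
        System.out.println(whitespaceTokens(title));
        System.out.println(wordTokensKeeping(title, ""));
        System.out.println(wordTokensKeeping(title, "&%"));
    }
}
```

With an empty keep list the standalone `&` and `%` vanish, which matches the symptom described in the original question. The whitespace version keeps them but also leaves the colon glued to `Loss:` and does no lowercasing, which is presumably what makes it feel brutal. Whichever behaviour is chosen has to be applied at both index time and query time, as Erick points out.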