From: "Ngo, Anh (ISS Southfield)"
Reply-To: java-user@lucene.apache.org
Subject: RE: StandardAnalyzer question
Date: Fri, 21 Jul 2006 15:59:14 -0400
Message-ID: <51540A3DDD507D40B6D47030C8C0C19101A3A1F1@soumaiexcp01.iss.net>

Hello Mark,

Please show me how to add "-" to the #LETTER definition.

Thanks,

Anh Ngo

-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com]
Sent: Friday, July 21, 2006 3:51 PM
To: java-user@lucene.apache.org
Subject: Re: StandardAnalyzer question

I do not believe so. If you look above you will see that #P is only used
when looking for a NUM: a host IP, a phone number, etc. You would be
removing the ability to recognize a "_" when picking out those tokens.
The "_" will also still be parsed when tokenizing an EMAIL. I don't think
this is the behavior you want.

- Mark

On 7/21/06, Ngo, Anh (ISS Southfield) wrote:
>
> What is the #LETTER definition in StandardTokenizer.jj?
>
> I saw:
>
> | <#P: ("_"|"-"|"/"|"."|",") >
> | <#HAS_DIGIT:    // at least one digit
>     (<LETTER>|<DIGIT>)*
>     <DIGIT>
>     (<LETTER>|<DIGIT>)*
>   >
>
> Should I remove "_" and recompile the source code?
>
> Sincerely,
>
> Anh Ngo
>
> -----Original Message-----
> From: Daniel Naber [mailto:lucenelist2005@danielnaber.de]
> Sent: Friday, July 21, 2006 2:49 PM
> To: java-user@lucene.apache.org
> Subject: Re: StandardAnalyzer question
>
> On Friday 21 July 2006 16:16, Ngo, Anh (ISS Southfield) wrote:
>
> > The Lucene 2.0.0 StandardAnalyzer does treat the "_" (underscore) as a
> > token. Is there a way I can make StandardAnalyzer not tokenize on
> > "_" or any given characters?
>
> You need to add "_" to the #LETTER definition in StandardTokenizer.jj,
> then rebuild StandardTokenizer.java using the appropriate Ant task.
>
> Regards
> Daniel
>
> --
> http://www.danielnaber.de

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
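
For readers following the thread, here is a minimal sketch of why the advice is to add "_" to the letter class rather than remove it from #P. It is plain Java and does not use Lucene or the JavaCC grammar. The class name LetterClassSketch, the tokenize helper, and its extraLetters parameter are hypothetical names made up for illustration, and the splitting rule (runs of letters and digits) is only a rough stand-in for what StandardTokenizer does.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: NOT Lucene's StandardTokenizer. It mimics the idea that
// a token is a run of "letter" characters, and shows how widening the letter
// class changes where tokens break.
public class LetterClassSketch {

    // Hypothetical helper: splits text into runs of letters/digits.
    // Any character listed in extraLetters is treated as a letter too.
    static List<String> tokenize(String text, String extraLetters) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean wordChar = Character.isLetterOrDigit(c)
                    || extraLetters.indexOf(c) >= 0;
            if (wordChar) {
                current.append(c);
            } else if (current.length() > 0) {
                // A non-word character ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "_" is not in the letter class, so it acts as a separator.
        System.out.println(tokenize("foo_bar baz", ""));   // [foo, bar, baz]
        // "_" added to the letter class, so foo_bar stays one token.
        System.out.println(tokenize("foo_bar baz", "_"));  // [foo_bar, baz]
    }
}
```

In the real grammar the equivalent change would be made in the #LETTER definition of StandardTokenizer.jj and followed by regenerating StandardTokenizer.java, as Daniel describes; the sketch only shows the effect on token boundaries.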