From: Ilya Zavorin
To: "java-user@lucene.apache.org"
Subject: RE: Efficient string lookup using Lucene
Date: Sun, 26 Aug 2012 01:06:36 +0000

Does it mean that the resulting index will be very large?

Thanks,

Ilya

-----Original Message-----
From: Ahmet Arslan [mailto:iorixxx@yahoo.com]
Sent: Friday, August 24, 2012 4:59 PM
To: java-user@lucene.apache.org
Subject: Re: Efficient string lookup using Lucene

> search for a string "run", I do not need to find "ran" but I do want
> to find it in all of these strings below:
>
> Fox is running fast
> !%#^&$run!$!%@&$#
> run,run

With NGramFilter you can do that. But it creates a lot of tokens. For example, "Fox is running fast" becomes

F o x Fo ox Fox
i s is
r u n n i n g ru un nn ni in ng *run* unn nni nin ing runn unni nnin ning runni unnin nning runnin unning running
f a s t fa as st fas ast fast

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
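
To get a rough feel for how many tokens the n-gram expansion in the quoted reply produces, here is a minimal sketch in plain Java. It does not use Lucene at all: the class NGramSketch and its methods ngrams and countNgrams are hypothetical names made up for illustration, and the whitespace split only stands in for whatever tokenizer would actually run before NGramFilter. It reproduces the listing for "Fox is running fast" and counts the grams per word.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: expands a token into character n-grams.
public class NGramSketch {

    // Returns every character n-gram of `token` with length minGram..maxGram,
    // shortest grams first, left to right within each length.
    public static List<String> ngrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxGram, token.length());
        for (int n = minGram; n <= longest; n++) {
            for (int start = 0; start + n <= token.length(); start++) {
                grams.add(token.substring(start, start + n));
            }
        }
        return grams;
    }

    // Counts the n-grams for a token of the given length without building them.
    public static int countNgrams(int tokenLength, int minGram, int maxGram) {
        int count = 0;
        int longest = Math.min(maxGram, tokenLength);
        for (int n = minGram; n <= longest; n++) {
            count += tokenLength - n + 1; // number of start positions for length n
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "Fox is running fast";
        int total = 0;
        // Assumption: a plain whitespace split stands in for the real tokenizer.
        for (String token : text.split("\\s+")) {
            List<String> grams = ngrams(token, 1, token.length());
            total += grams.size();
            System.out.println(token + " -> " + grams.size() + " grams: " + grams);
        }
        System.out.println("total grams: " + total);
    }
}
```

With every gram length kept, the four words give 6, 3, 28 and 10 grams, 47 in total, so a token of length L contributes L(L+1)/2 grams. Restricting the range cuts this down sharply: countNgrams(7, 3, 3) is 5 for "running", and "run" is still among those five. How the gram count translates into on-disk index size depends on how Lucene stores terms and postings, which this sketch does not model.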