From: Raymond Balmès
To: java-user@lucene.apache.org
Date: Sat, 30 Aug 2008 21:55:09 +0200
Subject: Beginner: Specific indexing

Hi guys,

I'm fairly new to Lucene and have just finished reading Lucene in Action.

My problem is the following: out of a mass of documents, I need to index only the documents that contain the following pattern(s):

  <#1> <#2>

  is a fixed list of words
  <#x> are small numbers <100

My idea is to simply build a TokenFilter that will look for those. Do I have it right?

Some side questions:
- What if I want to index <#1> <#2> as keywords?
- What if I also want to offer full-text search on the selected documents?

Thanks for your help.
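
To make the selection step concrete, here is a minimal sketch in plain Java (no Lucene classes), assuming the pattern is one word from the fixed list followed by two non-negative integers below 100, separated by whitespace. The class name PatternSelector, its method names, and the word list are hypothetical placeholders for illustration only; none of them are Lucene API.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class PatternSelector {
    // Hypothetical fixed word list; the real list is not given in this message.
    private static final List<String> WORDS = List.of("alpha", "beta", "gamma");

    // Assumed shape: one listed word, then two whitespace-separated integers in 0..99.
    // The trailing \b rejects longer numbers such as 700.
    private static final Pattern PATTERN = Pattern.compile(
        "\\b("
            + WORDS.stream().map(Pattern::quote).collect(Collectors.joining("|"))
            + ")\\s+(\\d{1,2})\\s+(\\d{1,2})\\b",
        Pattern.CASE_INSENSITIVE);

    /** Returns true when the text contains at least one occurrence of the pattern. */
    public static boolean containsPattern(String text) {
        return PATTERN.matcher(text).find();
    }

    /** Returns the first occurrence as {word, #1, #2}, or null when there is none. */
    public static String[] firstMatch(String text) {
        Matcher m = PATTERN.matcher(text);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }
}
```

Only documents for which containsPattern returns true would be passed on to the indexer. Whether this check is better done before indexing, as here, or inside a TokenFilter is exactly the part I am unsure about.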