From: Grant Ingersoll
Subject: Re: In which position of a document a word was found?
Date: Thu, 14 Jun 2007 09:23:07 -0400
To: java-user@lucene.apache.org

Have a look at the SpanQuery (starting at page 161 in LIA or in the
javadocs). I also have some info in my ApacheCon talk at
http://www.cnlp.org/presentations/slides/AdvancedLuceneEU.pdf and
http://www.cnlp.org/apachecon2005

Incidentally, the SpanQuery functionality does not require
TermVectors, so if you don't need them otherwise, you would get a
smaller index size.

Cheers,
Grant

On Jun 13, 2007, at 1:36 PM, Felipe Sánchez Martínez wrote:

> Hi all,
>
> I am new to Lucene and I have been reading the book "Lucene In Action",
> here is my question:
>
> When searching for a word through an index is there any way to know in
> which positions (may be more than one) of each document that word was
> found?
>
> The index is constructed in the following way:
> ---------------------
> IndexWriter writer = new IndexWriter("/path/to/the/index/dir",
>         new StandardAnalyzer(), true);
>
> writer.setUseCompoundFile(false);
>
> Document doc = new Document();
>
> doc.add(new Field("contents",
>         new FileReader(f), Field.TermVector.WITH_POSITIONS_OFFSETS));
>
> doc.add(new Field("filename", f.getCanonicalPath(), Field.Store.YES,
>         Field.Index.NO_NORMS));
>
> writer.addDocument(doc);
> .....
> --------------------
>
>
> Thanks in advance
> --
> Felipe.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>

------------------------------------------------------
Grant Ingersoll
http://www.grantingersoll.com/
http://lucene.grantingersoll.com
http://www.paperoftheweek.com/
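
The reply's point that SpanQuery does not need TermVectors follows from how a positional inverted index works: token positions are recorded per term and per document at indexing time, so a query can read them back directly. The block below is a minimal plain-Java sketch of that idea only. It is not Lucene code, and the class `PositionalIndexSketch`, its methods, and its tokenizer are hypothetical simplifications (for instance, it does not reproduce StandardAnalyzer's stop-word handling). In Lucene 2.x the equivalent lookup would go through `SpanTermQuery` and `getSpans(IndexReader)`, then read `doc()`, `start()` and `end()` from the returned `Spans`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy positional inverted index (illustration only): term -> (docId -> token positions). */
class PositionalIndexSketch {
    private final Map<String, TreeMap<Integer, List<Integer>>> postings = new HashMap<>();

    /** Lowercases the text, splits on non-letter/non-digit runs, and records each token's position. */
    public void addDocument(int docId, String text) {
        int position = 0;
        for (String token : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue; // a leading delimiter yields one empty token; skip it
            }
            postings.computeIfAbsent(token, t -> new TreeMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(position);
            position++;
        }
    }

    /** Returns docId -> positions for the term, or an empty map if the term was never indexed. */
    public Map<Integer, List<Integer>> positions(String term) {
        TreeMap<Integer, List<Integer>> hits = postings.get(term.toLowerCase());
        return hits == null ? new TreeMap<>() : hits;
    }

    public static void main(String[] args) {
        PositionalIndexSketch index = new PositionalIndexSketch();
        index.addDocument(0, "the quick brown fox jumps over the lazy dog");
        index.addDocument(1, "a dog is a dog");
        System.out.println(index.positions("dog")); // prints {0=[8], 1=[1, 4]}
    }
}
```

Each entry maps a document id to the zero-based token positions at which the word occurs, which is the information the original question asks for ("may be more than one" position per document).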