From: Rebecca Watson
To: java-user@lucene.apache.org
Subject: Re: Problem fetching number of occurrences
Date: Wed, 2 Jun 2010 13:56:50 +0800
In-Reply-To: <1275425617086-862859.post@n3.nabble.com>

hi

when you are indexing, use termvectors

org.apache.lucene.document.Field.TermVector

set this in the Field object constructor when you create your
Field objects at index time.

i've never done it but i'm pretty sure these can be retrieved at
search time using one of the IndexReader.getTermFreqVector methods.

lucene in action has a really good section on using termfreqvectors:
http://www.manning.com/hatcher3/

if you want the positional info too e.g. the two positions of the
"question" word in your example then have a look at the
org.apache.lucene.search.spans.SpanTermQuery class
-- in the getSpans method -- it grabs the terms + positions
using the IndexReader as well:

reader.termPositions(term)

hope that helps,

bec :)

On 2 June 2010 04:53, Sirish Vadala wrote:
>
> Hello All:
>
> Can any one suggest me the best way to get the no. of occurrences of each
> word per document in Lucene?
>
> Eg: Let the indexed text be:
>
> If you are posting a question, please try search first. Your question may
> have already been answered.
>
> Now if I search for the word 'question', then I would like to get this
> document along with the number of occurrences of question in the document,
> in the above case it would be 2.
>
> Any hint would be appreciated.
> Thanks.
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Problem-fetching-number-of-occurrences-tp862859p862859.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
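
To make the term vector suggestion in the reply a bit more concrete, here is a rough plain-Java sketch of the kind of information a term frequency vector carries for one document: each term, how often it occurs, and at which token positions. It makes no Lucene calls at all. The names TermCounts, positions and frequency are made up for illustration, and the tokenizer (lowercase, split on anything that is not a letter or digit) is an assumption; a real Lucene analyzer may tokenize differently, for example by dropping stop words, so the positions it reports need not match these.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Lucene: mimics the per-document data
// (term -> frequency and positions) that a stored term vector exposes.
class TermCounts {

    // Maps each term to the 0-based token positions where it occurs.
    static Map<String, List<Integer>> positions(String text) {
        Map<String, List<Integer>> result = new TreeMap<>();
        int position = 0;
        // Assumed tokenizer: lowercase, split on runs of non-alphanumerics.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue; // split() can yield "" when the text starts with a delimiter
            }
            result.computeIfAbsent(token, k -> new ArrayList<>()).add(position++);
        }
        return result;
    }

    // Number of occurrences of a term; 0 if the term does not appear.
    static int frequency(String text, String term) {
        List<Integer> hits = positions(text).get(term.toLowerCase(Locale.ROOT));
        return hits == null ? 0 : hits.size();
    }

    public static void main(String[] args) {
        String text = "If you are posting a question, please try search first. "
                + "Your question may have already been answered.";
        System.out.println(frequency(text, "question"));      // prints 2
        System.out.println(positions(text).get("question"));  // prints [5, 11]
    }
}
```

With the example text from the original question, the count for "question" comes out as 2, which is the number the poster was after. In Lucene itself the same numbers would come from the index (via the term vector or term positions APIs named in the reply) rather than from re-tokenizing the text.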