From: Erick Erickson
To: java-user@lucene.apache.org
Date: Wed, 8 Dec 2010 08:43:28 -0500
Subject: Re: FW: Re: lucene3.0.2: getting incorrect no. of occurrence in file

I don't think this code is doing anything predictable. From the javadocs
for TermDocs.skipTo():

    Skips entries to the first beyond the current whose document number is
    greater than or equal to *target*. Returns true iff there is such an
    entry.

You're not testing the return value from skipTo. The document IDs returned
in scoreDocs aren't necessarily in ascending doc ID order, so I'd guess that
skipTo is returning false a lot of the time, and in that case you're getting
the frequency of a doc that isn't in your result set.

Best
Erick

On Wed, Dec 8, 2010 at 12:33 AM, Ranjit Kumar wrote:

> Hi,
> Thanks for your reply!!!
> Below is code I am using for search
>
> String line="sql server";
> IndexReader reader = IndexReader.open(FSDirectory.open(new File(indexpath)), true); // contains index file path
> Searcher searcher = new IndexSearcher(reader);
> Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
> QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, field, analyzer);
> if (line != null) {
>     line = line.trim();
>     Query query = parser.parse(line);
>     int n = 100;
>     TopDocs docs = searcher.search(query, n);
>     TermDocs termDocs = reader.termDocs(new Term(field, line.toLowerCase()));
>     System.out.println(" " + docs.totalHits + " total matching documents\n");
>     for (int i = 0; i < docs.scoreDocs.length; i++) {
>         int totalFreq = 0;
>         Document document = reader.document(docs.scoreDocs[i].doc);
>         termDocs.skipTo(i);
>         String path = document.get("path");
>         System.out.println("path>>" + path);
>         totalFreq = termDocs.freq();
>         System.out.println("totalFreq >>" + totalFreq);
>     }
> }
>
> Thanks & Regards,
> Ranjit Kumar
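
To make the point in the reply concrete, here is a minimal sketch of the lookup pattern it implies: visit the hit doc IDs in ascending order, check what skipTo returns, and only trust freq() when doc() equals the ID that was asked for. Note also that the quoted loop passes the loop index i to skipTo rather than docs.scoreDocs[i].doc. Nothing below uses Lucene itself. PostingsCursor is a hypothetical plain-Java stand-in that imitates the skipTo()/doc()/freq() behaviour described in the javadoc quoted above, and freqsForHits and the sample postings are invented for illustration. With the real API the same loop shape would presumably be applied to a TermDocs obtained from reader.termDocs(term).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

class SkipToSketch {

    /**
     * Hypothetical stand-in for Lucene's TermDocs (an assumption, not the real class):
     * a forward-only cursor over (docId, freq) postings stored in ascending docId order.
     */
    static final class PostingsCursor {
        private final int[] docs;
        private final int[] freqs;
        private int pos = -1; // positioned before the first entry

        PostingsCursor(int[] docs, int[] freqs) {
            this.docs = docs;
            this.freqs = freqs;
        }

        /** Moves past the current entry to the first one with doc >= target; false if none is left. */
        boolean skipTo(int target) {
            do {
                pos++;
                if (pos >= docs.length) {
                    return false;
                }
            } while (docs[pos] < target);
            return true;
        }

        int doc() { return docs[pos]; }

        int freq() { return freqs[pos]; }
    }

    /** Returns docId -> term frequency for each hit; 0 when the term has no posting for that doc. */
    static Map<Integer, Integer> freqsForHits(PostingsCursor cursor, int[] hitDocIds) {
        // Hits arrive in score order, but the cursor only moves forward, so sort a copy first.
        int[] sorted = hitDocIds.clone();
        Arrays.sort(sorted);

        Map<Integer, Integer> result = new LinkedHashMap<>();
        boolean positioned = false; // true while the cursor sits on a valid entry
        boolean exhausted = false;  // true once skipTo has returned false
        for (int docId : sorted) {
            // skipTo always moves beyond the current entry, so call it only when
            // the cursor has not yet reached this docId.
            if (!exhausted && (!positioned || cursor.doc() < docId)) {
                positioned = cursor.skipTo(docId);
                exhausted = !positioned;
            }
            // Landing on a later doc means the term does not occur in docId.
            int freq = (positioned && cursor.doc() == docId) ? cursor.freq() : 0;
            result.put(docId, freq);
        }
        return result;
    }

    public static void main(String[] args) {
        // Invented postings: the term occurs in docs 2, 5 and 9.
        PostingsCursor cursor = new PostingsCursor(new int[] {2, 5, 9}, new int[] {1, 3, 2});
        int[] hits = {9, 3, 5}; // score order, not doc ID order
        System.out.println(freqsForHits(cursor, hits)); // prints {3=0, 5=3, 9=2}
    }
}
```

Sorting a copy of the hit IDs keeps the original score order available for display while letting a single forward pass over the postings serve every hit. If the score order must drive the loop instead, a fresh cursor per hit would be the simpler, slower alternative.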