From: Sean O'Connor
Date: Sun, 04 Sep 2005 15:12:32 -0400
To: java-user@lucene.apache.org
Subject: Re: Phrase frequency
Message-ID: <431B4720.1090300@oconeco.com>
In-Reply-To: <29788b2505090207325a920779@mail.gmail.com>

I believe the index only contains information about single terms. A PhraseQuery then searches the index for the individual terms of the phrase and returns the hit information. So, as far as I understand, there is no way to get the frequency of a phrase directly from an index, but you could create a PhraseQuery and use an IndexSearcher to return the Hits. That will provide only weighted hit scores, which does not sound like what you want.

This may be similar to a question I posted back on June 16th. Paul Elschot was kind enough to give me feedback. Search for that in the archives, or try this link:
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200506.mbox/%3c200506162239.53497.paul.elschot@xs4all.nl%3e

In summary, he suggests modifying (extending?) PhraseQuery and ExactPhraseScorer. I got sidelined trying to get character positions for hits, so I have not completed his suggested implementation. If I do, I would be happy to share.

Good luck, and feel free to post anything you think might be helpful if you implement something.

Sean

Fabio Cristiano dos Anjos wrote:

>Hi,
>
>How can I get phrase frequency in an index?
>
>Thanks in advance!!
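
To illustrate the point made in the reply about the index holding only per-term information: a phrase count has to be derived by lining up the positions of the individual terms. Here is a minimal, self-contained Java sketch of that idea using a toy in-memory positional index. It does not use Lucene's API and is not Paul Elschot's suggested approach; the class ToyPhraseCounter and its methods are hypothetical names made up for illustration, and the lowercase whitespace tokenization is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy positional index. NOT Lucene's API: every name here is hypothetical.
class ToyPhraseCounter {
    // term -> (docId -> positions of that term within the document)
    private final Map<String, Map<Integer, List<Integer>>> postings = new HashMap<>();

    // Index one document. Tokenization is a naive lowercase whitespace split (assumption).
    void addDocument(int docId, String text) {
        String[] tokens = text.toLowerCase().trim().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            if (tokens[pos].isEmpty()) continue; // happens only for empty input
            postings.computeIfAbsent(tokens[pos], t -> new HashMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(pos);
        }
    }

    // Count exact (adjacent, in-order) occurrences of the phrase over all documents.
    int phraseFrequency(String phrase) {
        String[] terms = phrase.toLowerCase().trim().split("\\s+");
        if (terms[0].isEmpty()) return 0; // empty phrase
        Map<Integer, List<Integer>> first = postings.get(terms[0]);
        if (first == null) return 0;
        int count = 0;
        // Every position of the first term is a candidate start of the phrase.
        for (Map.Entry<Integer, List<Integer>> entry : first.entrySet()) {
            for (int start : entry.getValue()) {
                if (matchesAt(terms, entry.getKey(), start)) count++;
            }
        }
        return count;
    }

    // True if terms[i] occurs at position start + i in the document, for every i >= 1.
    private boolean matchesAt(String[] terms, int docId, int start) {
        for (int i = 1; i < terms.length; i++) {
            Map<Integer, List<Integer>> byDoc = postings.get(terms[i]);
            if (byDoc == null) return false;
            List<Integer> positions = byDoc.get(docId);
            if (positions == null || !positions.contains(start + i)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        ToyPhraseCounter index = new ToyPhraseCounter();
        index.addDocument(0, "the quick brown fox jumps over the quick brown dog");
        index.addDocument(1, "brown quick fox");
        System.out.println(index.phraseFrequency("quick brown"));
    }
}
```

The count here is a raw number of occurrences across all documents, as opposed to the weighted scores that Hits returns. A per-document count would follow from the same loop by keeping a separate tally for each docId.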