lucene-general mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Frequency Term of Composite words
Date Wed, 16 Dec 2009 17:26:50 GMT
You need the term frequency vector.

See here
http://lucene.apache.org/java/2_4_1/api/org/apache/lucene/index/IndexReader.html#getTermFreqVector%28int,%20java.lang.String%29

This is compatible in 3.0 as well:
http://lucene.apache.org/java/3_0_0/api/core/org/apache/lucene/index/IndexReader.html#getTermFreqVector%28int,%20java.lang.String%29

Note the changed Javadoc path (api/core/ in 3.0); the package itself is still org.apache.lucene.index.
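
To make the idea concrete, here is a minimal sketch in plain Java, with no Lucene dependency, of what a per-document term vector with positions gives you and how a phrase count can be derived from it. The class and method names (TermVectorSketch, positions, termFrequency, phraseFrequency) are made up for illustration, and the tokenizer is a crude stand-in for a real analyzer. In Lucene 2.4/3.0 the corresponding data should come from getTermFreqVector, which returns a TermFreqVector (getTerms(), getTermFrequencies()) and, as far as I recall, a TermPositionVector (getTermPositions(int)) when the field was indexed with Field.TermVector.WITH_POSITIONS; it returns null if term vectors were not stored for that field.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene: an in-memory stand-in for a
// single document's term vector with positions.
class TermVectorSketch {

    /** Maps each lower-cased term of one document to its token positions. */
    static Map<String, List<Integer>> positions(String text) {
        Map<String, List<Integer>> result = new HashMap<>();
        int pos = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (token.isEmpty()) {
                continue; // split() can yield a leading empty token
            }
            result.computeIfAbsent(token, k -> new ArrayList<>()).add(pos++);
        }
        return result;
    }

    /** Term frequency: how many times the term occurs in the document. */
    static int termFrequency(Map<String, List<Integer>> vector, String term) {
        List<Integer> p = vector.get(term);
        return p == null ? 0 : p.size();
    }

    /**
     * Phrase frequency: number of positions where the (already lower-cased)
     * phrase terms occur consecutively, checked via the position lists.
     */
    static int phraseFrequency(Map<String, List<Integer>> vector, String... phrase) {
        if (phrase.length == 0) {
            return 0;
        }
        List<Integer> starts = vector.get(phrase[0]);
        if (starts == null) {
            return 0;
        }
        int count = 0;
        for (int start : starts) {
            boolean match = true;
            for (int i = 1; i < phrase.length && match; i++) {
                List<Integer> p = vector.get(phrase[i]);
                match = p != null && p.contains(start + i);
            }
            if (match) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> v =
                positions("the quick brown fox jumps over the lazy dog");
        System.out.println(termFrequency(v, "the"));                      // 2
        System.out.println(termFrequency(v, "quick"));                    // 1
        System.out.println(phraseFrequency(v, "quick", "brown", "fox"));  // 1
    }
}
```

The same position-matching loop should carry over to Lucene by looking up each phrase term's index with indexOf(String) on the vector and reading its positions, assuming the field was indexed with positions enabled.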


On Wed, Dec 16, 2009 at 7:34 AM, Antonio Calò <anton.calo@gmail.com> wrote:

> Hi all,
>
> I hope that you can help me with this.
>
> I'm looking for a fast way to obtain, for a given word, its term frequency
> (I mean how many times it occurs in a single doc). I've looked through the
> mail archive and the LIA (Lucene in Action) book and I found something like
> this:
>
> IndexSearcher index = new IndexSearcher(invertedIndexinRam);
> Term term = new Term("doc", "quick");
> int occurrence = index.docFreq(term);
>
> OK, occurrence contains the occurrences of the word "quick" in the index
> (in my case the index will contain only one document, for example "the quick
> brown fox jumps over the lazy dog"). In this case the occurrence will be 1.
> :)
>
> But now I need to retrieve the number of occurrences of a composite word, for
> example "quick brown fox", and I'm not sure how to do this.
>
> Thanks in advance for your help.
>
> Best Regards.
>
> Antonio
>
>
>
> --
> Antonio Calò
> ------------------------------------------
> Software Developer Engineer
> @ Intellisemantic
> Mail anton.calo@gmail.com
> Tel. 011-56.90.429
> ------------------------------------------
>



-- 
Ted Dunning, CTO
DeepDyve
