lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: About counting term hits
Date Thu, 13 Nov 2008 23:11:53 GMT
The more Documents you have to look at the slower it will be, but it may still be fast enough
- it's impossible to tell without considering index size, query volume, hardware, number of
hits/Docs, etc.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch




________________________________
From: "lbarcala@freeresearch.org" <lbarcala@freeresearch.org>
To: java-user@lucene.apache.org
Sent: Thursday, November 13, 2008 11:35:13 AM
Subject: Re: About counting term hits

> Mario,
>
> Does this help:
> http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//org/apache/lucene/index/TermFreqVector.html
>
> Plus:
> http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//org/apache/lucene/index/IndexReader.html#method_summary
> (look for "getTerm.Freq...")
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>

So, if I undertand the solution, the main steps to do what I propose is:

1) To obtain the documents which match the query (documents which include
the word "house")
2) To loop throw matching documents to access the IndexReader for
obtaining their term frequencies.
3) To obtain from TermFreqVector the frequencies of the Term ("house") to
calculate the result.

And, if it is a very frequent query and there are much documents (> 10.000),
would LUCENE solve it in a reasonable time? A query might match several
hundred documents.

Thank you,

  Mario Barcala

>>> Hello:
>>>
>>> I am new to LUCENE and I am testing some issues about it. I can
>>> retrieve
>>> the number of documents which satisfies a query, but I don't find how
>>> to
>>> obtain the number of terms which match it.
>>>
>>> For example, if I search for the word "house", I want to obtain the
>>> number of times the word occurs (not the number of documents).
>>>
>>> Is it possible to do it in LUCENE?
>>>
>>> Thanks in advance,
>>>
>>>  Mario Barcala
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>>
>> --
>> ----------------------------------------
>> "Help Ever Hurt Never"- Baba
>>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message