lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Simon Willnauer <simon.willna...@googlemail.com>
Subject Re: Counting search results
Date Tue, 15 Sep 2009 15:13:47 GMT
Hmm, so if you wanna use the Filter to narrow down the search results
you could use it in the while loop like this:

BitSet set = filter.bits(reader);
 int numDocs
TermDocs termDocs = reader.termDocs(new Term("myField", "myTerm"));
while (termDocs.next()) {
  if(set.get(termDocs.doc()))
    numDocs++;
}

would that help?

simon
>>
On Tue, Sep 15, 2009 at 5:01 PM, Mathias Bank <mathias.bank@gmail.com> wrote:
> Hello,
>
> This seams to be a similar solution like:
>
> Term t = new Term(fieldname, term);
> int count = searcher.docFreq(t);
>
> The problem is, that in this situation it is not possible to apply a
> filter object. If I don't wanna use this filter object, I would have
> to use a complex search query, wich is - again - very slow. So,
> unfortunatelly, your solution does not help.
>
> Mathias
>
> 2009/9/15 Simon Willnauer <simon.willnauer@googlemail.com>:
>> Did you try:
>> int numDocs
>> TermDocs termDocs = reader.termDocs(new Term("myField", "myTerm"));
>> while (termDocs.next()) { numDocs++; }
>>
>> simon
>>
>> On Tue, Sep 15, 2009 at 2:19 PM, Mathias Bank <mathias.bank@gmail.com> wrote:
>>> Hello,
>>>
>>> I'm trying to find the number of documents for a specific term to
>>> create text statistics. I'm not interested in ordering the results or
>>> even recieving the first result. I just need the number of results.
>>>
>>> Currently, I'm trying to do this by using the lucene searcher class:
>>>
>>> IndexSearcher searcher = new IndexSearcher(reader);
>>> String queryString = fieldname+":" + term;
>>> QueryParser parser = new QueryParser(fieldname, new GermanAnalyzer());
>>> TopDocs d = searcher.search(parser.parse(queryString), filter, 1);
>>> int count = d.totalHits;
>>>
>>> The problem is, that there is a large index (optimized) with > 8 mio.
>>> entries. One search could return a large number of search results (> 1
>>> mio). Currently these search tasks take more than 15 secunds.
>>>
>>> The question is: is there any way to get the number of search results
>>> faster? I think, that it could be optimized by not using a Weight
>>> object (order is not interesting), but I haven't seen a way to do
>>> this.
>>>
>>> I hope, someone has already solved this problem.
>>>
>>> Mathias
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message