lucene-java-user mailing list archives

From Mathias Bank <mathias.b...@gmail.com>
Subject Re: Counting search results
Date Thu, 17 Sep 2009 14:25:50 GMT
Hello,

I have tried your method, but it doesn't work.

set is null after calling

BitSet set = filter.bits(reader);

I haven't found any reason for this.

Additionally, the bits method is deprecated, and the documentation says to
use "getDocIdSet" instead. But that set only provides an iterator, so no
random-access membership checks (like BitSet.get) are possible.
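
Because getDocIdSet exposes only an iterator, one way to count filtered matches without a BitSet is to walk the term's postings and the filter's doc IDs in lockstep, since both are delivered in increasing doc ID order. The following is a minimal sketch of that idea in plain Java, not Lucene code: the two int arrays are stand-ins (assumptions) for the doc IDs that TermDocs and the filter's iterator would yield, and the class and method names are made up for illustration.

```java
// Hypothetical helper, not part of Lucene: counts doc IDs common to two
// sorted, duplicate-free ID lists by advancing whichever side is behind.
class IntersectionCount {

    static int countIntersection(int[] termDocs, int[] filterDocs) {
        int i = 0;      // position in the term's doc IDs
        int j = 0;      // position in the filter's doc IDs
        int count = 0;  // number of IDs present in both lists

        while (i < termDocs.length && j < filterDocs.length) {
            if (termDocs[i] < filterDocs[j]) {
                i++;            // term side is behind, move it forward
            } else if (termDocs[i] > filterDocs[j]) {
                j++;            // filter side is behind, move it forward
            } else {
                count++;        // same doc ID on both sides: a filtered hit
                i++;
                j++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Assumed sample data standing in for real postings and filter output.
        int[] termDocs = {1, 4, 7, 9, 12};
        int[] filterDocs = {2, 4, 9, 10, 12, 15};
        System.out.println(countIntersection(termDocs, filterDocs)); // prints 3
    }
}
```

With real iterators, the same loop could probably use their skip methods (TermDocs.skipTo, and skipTo or advance on DocIdSetIterator, depending on the Lucene version) to jump ahead instead of stepping one document at a time. Whether this is faster than the search call would need to be measured on the actual index.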

Are there any other ways to improve speed?

Mathias


On 15.09.2009 at 17:13, Simon Willnauer <simon.willnauer@googlemail.com> wrote:
> Hmm, so if you want to use the Filter to narrow down the search results,
> you could use it in the while loop like this:
>
> BitSet set = filter.bits(reader);
> int numDocs = 0;
> TermDocs termDocs = reader.termDocs(new Term("myField", "myTerm"));
> while (termDocs.next()) {
>   // count only documents that pass the filter
>   if (set.get(termDocs.doc()))
>     numDocs++;
> }
>
> Would that help?
>
> simon
>
> On Tue, Sep 15, 2009 at 5:01 PM, Mathias Bank <mathias.bank@gmail.com> wrote:
> > Hello,
> >
> > This seems to be similar to:
> >
> > Term t = new Term(fieldname, term);
> > int count = searcher.docFreq(t);
> >
> > The problem is that in this situation it is not possible to apply a
> > filter object. If I don't use this filter object, I would have
> > to use a complex search query, which is, again, very slow. So,
> > unfortunately, your solution does not help.
> >
> > Mathias
> >
> > 2009/9/15 Simon Willnauer <simon.willnauer@googlemail.com>:
> >> Did you try:
> >> int numDocs = 0;
> >> TermDocs termDocs = reader.termDocs(new Term("myField", "myTerm"));
> >> while (termDocs.next()) { numDocs++; }
> >>
> >> simon
> >>
> >> On Tue, Sep 15, 2009 at 2:19 PM, Mathias Bank <mathias.bank@gmail.com> wrote:
> >>> Hello,
> >>>
> >>> I'm trying to find the number of documents for a specific term to
> >>> create text statistics. I'm not interested in ordering the results or
> >>> even receiving the first result. I just need the number of results.
> >>>
> >>> Currently, I'm trying to do this by using the Lucene searcher class:
> >>>
> >>> IndexSearcher searcher = new IndexSearcher(reader);
> >>> String queryString = fieldname + ":" + term;
> >>> QueryParser parser = new QueryParser(fieldname, new GermanAnalyzer());
> >>> TopDocs d = searcher.search(parser.parse(queryString), filter, 1);
> >>> int count = d.totalHits;
> >>>
> >>> The problem is that the index is large (optimized), with > 8 million
> >>> entries. One search could return a large number of search results
> >>> (> 1 million). Currently these search tasks take more than 15 seconds.
> >>>
> >>> The question is: is there any way to get the number of search results
> >>> faster? I think it could be optimized by not using a Weight
> >>> object (order is not interesting), but I haven't seen a way to do
> >>> this.
> >>>
> >>> I hope someone has already solved this problem.
> >>>
> >>> Mathias
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >>> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>>
