lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matt Magoffin" <apache....@msqr.us>
Subject Re: efficiently finding all terms used on a particular field withinDocuments matching a query
Date Fri, 11 Nov 2005 00:30:08 GMT
> : I'm wondering if there a more efficient way to accomplish this?
>
> I believe there is -- provided the terms are index.
>
> 1) Get yourself a BitSet representing the Documents you are interested in
> (you mentioned having a a date range, you can either use a RangeFilter nad
> call the bits method directly, or you can do a search using a
> HitCollector)
>
> 2) Look at the code that acctually makes RangeFilter work.  It iterates
> over a TermEnum between a low and high value.  for each term it finds, it
> uses a TermDocs to record the docid.  You could do something very similar,
> looping over all terms in the field you want.  but instead of recording
> the docid, add the term to your Set object -- if and only if one of the
> docids from the TermDocs is in your BitSet from step #1 above.
>
>
> ...that should be faster then the appraoch you have now.

Thank you, I had thought a BitSet was appropriate here somehow, I'll work
on this approach.

-- m@


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message