lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ian Lea <ian....@gmail.com>
Subject Re: Filter to retrieve random documents without specific terms ?
Date Tue, 29 Mar 2011 19:48:19 GMT
Here are a couple of ideas.

Plan A.

Think of a number, say 10, retrieve n * 10 docids in your search and
then loop round java.util.Random.nextInt(n * 10) until you've got
enough.

Plan B.

Reverse your MUST NOT search to get a list of docids that you don't
want, then loop round Random.nextInt(indexreader.numDocs()), selecting
those that are not deleted (!indexreader.isDeleted(docid)) and are not
in your exclusion list.


I'm sure there are other ways, probably better.


--
Ian.


On Tue, Mar 29, 2011 at 8:00 PM, Patrick Diviacco
<patrick.diviacco@gmail.com> wrote:
> Ok I've solved the first part of the problem. I'm now selecting all
> documents that do not contain a given term with a BooleanFilter
> and FilterClause, MUST NOT.
>
> I still have to understand how to retrieve random documents and limit the
> number of retrieved docs to N.
>
> thanks
>
> On 29 March 2011 20:40, Patrick Diviacco <patrick.diviacco@gmail.com> wrote:
>
>> Is there a Filter to get a limited number of random collection docs from
>> the index which DO NOT contain a specific term ?
>>
>> i.e. term="pizza"
>>
>> I want to run the query against 10 random documents of the collection that
>> do not contain the term "pizza".
>>
>> thanks
>>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message