lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: on-the-fly "filters" from docID lists
Date Fri, 23 Jul 2010 00:55:55 GMT
Well, Lucene can apply such a filter rather quickly; but, your custom
code first has to build it... so it's really a question of whether
your custom code can build up / iterate the filter scalably.

Mike
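
To make the build-side cost concrete, here is a minimal sketch in plain Java of turning one user's docID list into per-segment bit sets. It is not Lucene code and uses no Lucene classes: `SegmentFilterSketch`, `perSegmentBits`, `docBases` and `maxDoc` are hypothetical names, and it assumes the key-value store returns index-wide docIDs already sorted ascending and that each segment's starting offset is known. A custom Filter's getDocIdSet would need to do something equivalent for each segment it is asked about.

```java
import java.util.BitSet;

// Plain-Java sketch; it does not use or reproduce any Lucene class.
final class SegmentFilterSketch {

    /**
     * Splits ascending, index-wide docIDs into one BitSet per segment.
     * docBases[i] is the first index-wide docID of segment i (ascending,
     * docBases[0] == 0); maxDoc is one past the last docID in the index.
     * Bits are segment-relative: bit k of segment i means docBases[i] + k.
     */
    static BitSet[] perSegmentBits(int[] sortedDocIds, int[] docBases, int maxDoc) {
        BitSet[] bits = new BitSet[docBases.length];
        int next = 0; // cursor into sortedDocIds, advanced once across all segments
        for (int seg = 0; seg < docBases.length; seg++) {
            int base = docBases[seg];
            int end = (seg + 1 < docBases.length) ? docBases[seg + 1] : maxDoc;
            bits[seg] = new BitSet(end - base);
            // Consume every allowed docID that falls inside this segment.
            while (next < sortedDocIds.length && sortedDocIds[next] < end) {
                bits[seg].set(sortedDocIds[next] - base);
                next++;
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        int[] allowed = {1, 4, 5, 9};   // docIDs fetched for one user
        int[] docBases = {0, 4, 8};     // three segments: [0,4), [4,8), [8,10)
        BitSet[] bits = perSegmentBits(allowed, docBases, 10);
        for (int seg = 0; seg < bits.length; seg++) {
            System.out.println("segment " + seg + " -> " + bits[seg]);
        }
    }
}
```

The loop touches each docID once, so building the sets is linear in the list size. In this sketch the work that grows with a list of hundreds of thousands of IDs is fetching the list and keeping it sorted, which happens outside Lucene.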

On Thu, Jul 22, 2010 at 4:37 PM, Burton-West, Tom <tburtonw@umich.edu> wrote:
> Hi Mike and Martin,
>
> We have a similar use-case. Is there a scalability/performance issue with
> the getDocIdSet having to iterate through hundreds of thousands of docIDs?
>
> Tom Burton-West
> http://www.hathitrust.org/blogs/large-scale-search
>
> -----Original Message-----
> From: Michael McCandless [mailto:lucene@mikemccandless.com]
> Sent: Thursday, July 22, 2010 5:20 AM
> To: java-user@lucene.apache.org
> Subject: Re: on-the-fly "filters" from docID lists
>
> It sounds like you should implement a custom Filter?
>
> Its getDocIdSet would consult your foreign key-value store and iterate
> through the allowed docIDs, per segment.
>
> Mike
>
> On Wed, Jul 21, 2010 at 8:37 AM, Martin J <martinj.engine@gmail.com> wrote:
>> Hello, we are trying to implement a query type for Lucene (with eventual
>> target being Solr) where the query string passed in needs to be "filtered"
>> through a large list of document IDs per user. We can't store the user ID
>> information in the Lucene index per document so we were planning to pull the
>> list of documents owned by user X from a key-value store at query time and
>> then build some sort of filter in memory before doing the Lucene/Solr query.
>> For example:
>>
>> content:"cars" user_id:X567
>>
>> would first pull the list of docIDs that user_id:X567 has "access" to from a
>> key-value store and then we'd query the main index with content:"cars" but
>> only allow the docIDs that came back to be part of the response. The list of
>> docIDs can near the hundreds of thousands.
>>
>> What should I be looking at to implement such a feature?
>>
>> Thank you
>> Martin
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
