From: Michael McCandless
To: java-user@lucene.apache.org
Date: Thu, 22 Jul 2010 05:20:15 -0400
Subject: Re: on-the-fly "filters" from docID lists

It sounds like you should implement a custom Filter? Its getDocIdSet would consult your foreign key-value store and iterate through the allowed docIDs, per segment.
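
A rough sketch of the per-segment part, in plain Java with no Lucene classes, so it runs on its own. It assumes the key-value store hands back index-wide integer docIDs and that each segment covers a contiguous range starting at some base offset. The names AllowedDocsSketch, bitsForSegment, docBase and maxDoc are made up for illustration and are not Lucene API. In a real Filter the same translation would happen inside getDocIdSet, which would return the result as a DocIdSet.

```java
import java.util.Arrays;
import java.util.BitSet;

// Stand-in sketch only: none of these names are Lucene API.
final class AllowedDocsSketch {
    // Allowed index-wide docIDs (assumed already fetched from the
    // key-value store), kept in ascending order.
    private final int[] allowed;

    AllowedDocsSketch(int[] allowedDocIds) {
        allowed = allowedDocIds.clone();
        Arrays.sort(allowed);
    }

    // Returns segment-local bits for the segment that covers the
    // index-wide IDs [docBase, docBase + maxDoc).
    BitSet bitsForSegment(int docBase, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        long end = (long) docBase + maxDoc; // long avoids int overflow
        for (int i = lowerBound(allowed, docBase);
                i < allowed.length && allowed[i] < end; i++) {
            bits.set(allowed[i] - docBase); // shift to a segment-local ID
        }
        return bits;
    }

    // Index of the first element >= key (plain binary search).
    private static int lowerBound(int[] a, int key) {
        int lo = 0, hi = a.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[mid] < key) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    public static void main(String[] args) {
        AllowedDocsSketch sketch = new AllowedDocsSketch(new int[] {15, 3, 12, 7});
        System.out.println(sketch.bitsForSegment(0, 10));  // prints {3, 7}
        System.out.println(sketch.bitsForSegment(10, 10)); // prints {2, 5}
    }
}
```

Sorting the list once and binary-searching to each segment's start means every segment only touches its own slice, which matters when the list nears the hundreds of thousands. If the store returns application-level IDs rather than index docIDs, a lookup step would be needed first; that is not shown here.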

Mike

On Wed, Jul 21, 2010 at 8:37 AM, Martin J wrote:
> Hello, we are trying to implement a query type for Lucene (with eventual
> target being Solr) where the query string passed in needs to be "filtered"
> through a large list of document IDs per user. We can't store the user ID
> information in the Lucene index per document so we were planning to pull the
> list of documents owned by user X from a key-value store at query time and
> then build some sort of filter in memory before doing the Lucene/Solr query.
> For example:
>
> content:"cars" user_id:X567
>
> would first pull the list of docIDs that user_id:X567 has "access" to from a
> key-value store and then we'd query the main index with content:"cars" but
> only allow the docIDs that came back to be part of the response. The list of
> docIDs can near the hundreds of thousands.
>
> What should I be looking at to implement such a feature?
>
> Thank you
> Martin