lucene-java-user mailing list archives

From jamie <>
Subject Re: Lucene TermsFilter lookup slow
Date Sun, 09 Aug 2015 07:17:33 GMT

Thank you kindly for the reply. I am using Lucene v4.10.4. Are the
optimizations you refer to available in this version?

We haven't yet upgraded to Lucene 5 as there appear to be many API changes.


On 2015/08/08 5:13 PM, Michael McCandless wrote:
> Which version of Lucene are you using?  Newer versions have optimized
> the "primary key" use case somewhat...
> Mike McCandless
> On Sat, Aug 8, 2015 at 8:32 AM, jamie <> wrote:
>> Greetings
>> Our app primarily uses Lucene for its intended purpose, i.e. to search across
>> large amounts of unstructured text. However, our requirements recently
>> expanded to include look-ups of specific documents in the index based on
>> associated custom-defined unique keys. For our purposes, a unique key is the
>> string representation of a 128-bit murmur hash, stored in a Lucene field
>> named uid.  We are currently using the TermsFilter to look up documents in
>> the Lucene index as follows:
>> List<Term> terms = new LinkedList<>();
>> for (String id : ids) {
>>     terms.add(new Term("uid", id));
>> }
>> TermsFilter idFilter = new TermsFilter(terms);
>> ... search logic...
>> At any time we may need to look up, say, a couple of thousand documents. Our
>> problem is one of performance. On very large indexes with 30 million records
>> or more, the look-up can be excruciatingly slow. At this stage, it's not
>> practical for us to move the data over to a fit-for-purpose database, nor to
>> change the uid field to a numeric type. I fully appreciate the fact that
>> Lucene is not designed to be a database; however, is there anything we can
>> do to improve the performance of these look-ups?
>> Much appreciated
>> Jamie
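
For readers unfamiliar with the key format described above, here is a minimal sketch of how such uid strings might be produced and prepared before they are wrapped in Term objects. It uses only the JDK. The class name UidKeys, the method names, and the hex encoding are all assumptions made for illustration; the original message does not say how the 128-bit hash is rendered as a string.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper, not part of Lucene or of the original poster's code.
final class UidKeys {
    private UidKeys() {}

    // Renders a 128-bit hash, supplied as two 64-bit halves, as a
    // 32-character lowercase hex string. Assumption: hex is the encoding;
    // the thread only says "string representation".
    static String toHex(long high, long low) {
        // %016x zero-pads to 16 digits and treats the long as unsigned.
        return String.format("%016x%016x", high, low);
    }

    // Drops duplicate ids while keeping first-seen order, so that each
    // uid would contribute a single Term to the look-up.
    static List<String> distinct(List<String> ids) {
        return new ArrayList<>(new LinkedHashSet<>(ids));
    }
}
```

Under these assumptions, UidKeys.distinct(ids) would be applied to the id list before the loop that builds the terms. Whether that has any measurable effect on look-up time is not something this sketch claims.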
