lucene-solr-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: Large disjunction query practices
Date Mon, 09 Jun 2014 13:07:44 GMT
Are they expecting relevancy ranking or merely seeking to do a bulk read of 
those documents? Please detail what the user is trying to accomplish with 
such a monster list of IDs.

Generally, queries of more than a few dozen terms are a bad idea, if for no 
other reason than that debugging them or examining the results by hand 
will be a nightmare. OTOH, some people really love drama and just 
can't get enough of it.

The general guidance is to keep requests and responses relatively small. 
Keep network traffic down. Keep compute intensity down. Keep memory 
requirements down.

Small is better.

-- Jack Krupansky

-----Original Message----- 
From: Joe Gresock
Sent: Monday, June 9, 2014 8:50 AM
To: solr-user@lucene.apache.org
Subject: Large disjunction query practices

I'm wondering what the best practice for large disjunction queries in Solr is.
A user wants to submit a query for several hundred thousand terms, like:
(term1 OR term2 OR ... term500,000)

I know it might be better to break this up into multiple queries that can
be merged on the user's end, but I'm wondering if there's guidance for a
good limit of OR'ed terms per query.  100 terms?  200? 500?  Any idea what
kinds of data set or memory limitations might govern this threshold?
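
As a rough illustration of the "break this up into multiple queries" idea,
here is a minimal Python sketch that splits a long term list into fixed-size
batches and builds one OR query string per batch. The function names, the
field name "id", and the batch size of 200 are all assumptions made for the
example, not recommendations from this thread; sending the queries to Solr
and merging the results is left out.

```python
from typing import Iterator, List, Sequence


def chunk_terms(terms: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `terms`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(terms), batch_size):
        yield list(terms[start:start + batch_size])


def build_or_query(field: str, terms: Sequence[str]) -> str:
    """Build a single query string of the form field:("t1" OR "t2" ...)."""
    # Quote each term, escaping backslashes and double quotes, so that
    # whitespace or query syntax characters inside a term are taken literally.
    quoted = [
        '"{}"'.format(t.replace("\\", "\\\\").replace('"', '\\"'))
        for t in terms
    ]
    return "{}:({})".format(field, " OR ".join(quoted))


def batched_queries(field: str, terms: Sequence[str],
                    batch_size: int = 200) -> List[str]:
    """Return one OR query string per batch (batch_size is a hypothetical default)."""
    return [build_or_query(field, batch)
            for batch in chunk_terms(terms, batch_size)]


if __name__ == "__main__":
    for query in batched_queries("id", ["a", "b", "c", "d", "e"], batch_size=2):
        print(query)
```

Whatever batch size is chosen, it probably needs to stay under the
maxBooleanClauses setting in solrconfig.xml (1024 by default), which caps
the number of clauses in a boolean query.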

Thanks,
Joe

-- 
I know what it is to be in need, and I know what it is to have plenty.  I
have learned the secret of being content in any and every situation,
whether well fed or hungry, whether living in plenty or in want.  I can do
all this through him who gives me strength.    *-Philippians 4:12-13* 

