lucene-solr-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: caching repeated OR'd terms
Date Sat, 08 May 2010 03:18:28 GMT
I would suggest benchmarking this before doing any more complex
design. A field with only 10k unique integer or string values should
search very quickly.
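
A rough way to run such a benchmark, sketched in Python. The Solr URL,
the request parameters and the helper names below are assumptions for
illustration, not something taken from the original setup; only the
field name and the example IDs come from the question. The sketch builds
the OR'd query in the grouped form myfield:(3 OR 202 ...), which the
Lucene query parser treats the same as the spelled-out version, and
times each request.

```python
import time
import urllib.parse
import urllib.request

# Assumed values for illustration; adjust to the real installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"
FIELD = "myfield"


def build_or_query(field, ids):
    """Return a grouped OR query such as 'myfield:(3 OR 202)'."""
    if not ids:
        raise ValueError("need at least one ID")
    return "%s:(%s)" % (field, " OR ".join(str(i) for i in ids))


def build_url(base_url, query, rows=10):
    """Return the select URL with the query URL-encoded as the q parameter."""
    params = urllib.parse.urlencode({"q": query, "rows": rows, "wt": "json"})
    return base_url + "?" + params


def fetch(url):
    """Issue the HTTP request and return the raw response body."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def time_queries(id_sets, run=fetch, field=FIELD, base_url=SOLR_SELECT_URL):
    """Run one query per ID set and return the elapsed seconds for each."""
    timings = []
    for ids in id_sets:
        url = build_url(base_url, build_or_query(field, ids))
        start = time.perf_counter()
        run(url)
        timings.append(time.perf_counter() - start)
    return timings
```

Calling time_queries([[3, 202, 3030, 505], [7, 42]]) against a running
instance returns one wall-clock timing per ID set. Using many distinct
ID sets, including a few with hundreds of IDs, would show how the
low-cache-hit case behaves.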

On Thu, May 6, 2010 at 7:54 AM, Nagelberg, Kallin
<KNagelberg@globeandmail.com> wrote:
> Hey everyone,
>
> I'm having some difficulty figuring out the best way to optimize for a certain query
> situation. My documents have a multi-valued field that stores lists of IDs. All in all there
> are probably about 10,000 distinct IDs throughout my index. I need to be able to query and
> find all documents that contain any of a given set of IDs, i.e., I want to find all documents
> that contain IDs 3, 202, 3030 or 505. Currently I'm implementing this like so:
>
> q=(myfield:3) OR (myfield:202) OR (myfield:3030) OR (myfield:505)
>
> There could be hundreds of terms, although 90% of the time
> it will be under 10. Ideally I would like to do this with a filter query, but I have read
> that it is impossible to cache OR'd terms in an fq, though this feature may come soon. The
> problem is that the combinations of OR'd terms will almost always be unique, so the query
> cache will have a very low hit rate. It would be great if the terms could be cached
> individually, but I'm not sure how to accomplish that.
>
> Any suggestions would be welcome!
> -Kallin Nagelberg
>
>



-- 
Lance Norskog
goksron@gmail.com
