lucene-solr-user mailing list archives

From Greg Preston <gpres...@marinsoftware.com>
Subject Re: Autosuggest on very large index
Date Tue, 20 Aug 2013 19:51:22 GMT
DocValues looks interesting, a non-inverted field.  I'll play with it
a bit and see how it works.  Thanks for the suggestion.

I don't know how many total terms we've got, but each "document" is
only 2-5 words/terms on average, and there is a TON of overlap between
docs.



-Greg


On Tue, Aug 20, 2013 at 11:38 AM, Jack Krupansky
<jack@basetechnology.com> wrote:
> Sounds like a problem for DocValues - assuming the number of unique values
> fits reasonably in memory to avoid I/O.
>
> How many unique values do you have or contemplate for your two billion
> documents?
>
> Two possibilities:
>
> 1. You need a lot more hardware.
> 2. You need to scale back your ambitions.
>
> -- Jack Krupansky
>
> -----Original Message----- From: Greg Preston
> Sent: Tuesday, August 20, 2013 2:00 PM
>
> To: solr-user@lucene.apache.org
> Subject: Autosuggest on very large index
>
> Using Solr 4.4.0.
>
> I would like to be able to do an autosuggest query against one of the
> fields in our index and have the results be limited by an fq.
>
> I can get exactly the results I want with a facet query using a
> facet.prefix, but the first query takes ~5 minutes to run on our QA
> env (~240M docs).  I'm afraid to attempt it on prod (~2B docs).
> Subsequent queries are sufficiently fast (~500ms).
>
> I'm assuming the first query is uninverting the field.  Is there any
> way to mark that field so that an uninverted copy is maintained as
> updates come in?  We plan to soft commit every 5 minutes, and we'd
> prefer to not be continuously uninverting this one field.
>
> Or is there a better way to do what I'm trying to do?  I've looked at
> the spellcheck component a little bit, but it looks like I can't
> filter results by fq.  The fq I'm using is based on which client is
> logged in, and we can't autosuggest terms from one client to another.
>
> Thanks.
>
> -Greg
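
For readers following the thread, the facet.prefix approach described in the original message can be sketched roughly as below. This is only an illustration, not the poster's actual query: the base URL, the core name `collection1`, the field name `keyword`, and the `client_id` filter field are all hypothetical. The request parameters themselves (`fq`, `facet`, `facet.field`, `facet.prefix`, `facet.limit`, `facet.mincount`, `rows`) are standard Solr parameters.

```python
from urllib.parse import urlencode


def build_suggest_url(base_url, field, prefix, client_id, limit=10):
    """Build a Solr /select URL that facets on `field` by prefix,
    restricted to one client's documents through an fq.

    `client_id` as a filter field is an assumption for this sketch;
    substitute whatever field identifies the logged-in client.
    """
    params = [
        ("q", "*:*"),                       # match everything; the fq does the restricting
        ("rows", "0"),                      # only facet counts are needed, not documents
        ("fq", f"client_id:{client_id}"),   # hypothetical per-client filter
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", prefix),           # matched against indexed terms
        ("facet.limit", str(limit)),
        ("facet.mincount", "1"),            # skip terms with no matching docs under the fq
        ("wt", "json"),
    ]
    return f"{base_url}/select?{urlencode(params)}"


# Example with placeholder values.
url = build_suggest_url(
    "http://localhost:8983/solr/collection1", "keyword", "sho", 42
)
print(url)
```

Because facet.prefix is compared against the indexed terms, the prefix typed by the user may need the same normalization (for example lowercasing) that the field's analyzer applies at index time. A non-numeric client identifier would also need query-syntax escaping before being placed in the fq.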
