lucene-solr-user mailing list archives

From "Yonik Seeley" <yo...@apache.org>
Subject Re: Incremental updates/Sorting problem
Date Tue, 08 Aug 2006 14:39:49 GMT
On 8/8/06, bo_b <bo@staff.jubii.dk> wrote:
> As mentioned in another post, I am trying to index a vBulletin database
> containing roughly 7 million posts. The very first query where I apply
> sorting after a full indexing, seems to take roughly <QTime>264998</QTime>
> ms. Subsequent searches are fast.
>
> I figure the reason is as Chris explained
> here(http://www.mail-archive.com/solr-user@lucene.apache.org/msg00457.html)
> that
>
> "Sorting on a field requires building a FieldCache for every document --
> regardless of how many documents match your query.  This cache is reused
> for all searches that sort on that field."
>
> However my problem is that I would like to be able to incrementally add new
> postings to the index, as they occur.

> And it appears that if I add just 1
> post and do a <commit>, Solr/Lucene rebuilds FieldCaches for the entire
> index, not just the newly added posts, thus rendering my index unsearchable
> for roughly the next 264 seconds (at least for sorting queries).

Warming (either normal or auto-warming) will solve the problem of the
long first search.
Warming is done in the background, so no "real" live queries will see
that long delay.
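
For illustration only, a minimal sketch of what explicit warming could
look like in solrconfig.xml. The field name "postdate" and the query
text are hypothetical, and the way a sort is passed varies by Solr
version (older releases put the sort inside the q string after a
semicolon rather than in a separate sort parameter), so treat this as
an assumption-laden example rather than a drop-in config:

```xml
<!-- Runs against each new searcher in the background, before it
     starts serving live queries. Sorting on the field forces the
     FieldCache entry for that field to be built during warming. -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <!-- "postdate" is a hypothetical sort field -->
      <str name="q">solr</str>
      <str name="sort">postdate desc</str>
      <str name="start">0</str>
      <str name="rows">10</str>
    </lst>
  </arr>
</listener>
```

A matching listener with event="firstSearcher" would cover the very
first search after startup in the same way.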

That said, 264 seconds is a *long* time to build a FieldCache entry,
even for 6M documents.  Make sure that you have enough heap size and
that running out of memory isn't causing the GC to hog the CPU.

That doesn't solve the <commit> after every <add> problem though.
That's not the type of thing that Lucene (and Solr) are optimized for.
Most search collections can tolerate a few minutes of lag until new
documents become searchable.
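
As a rough sketch of the batching this implies, new posts could be
sent as they arrive and a single <commit/> issued only every few
minutes. Each element below would be a separate POST to the update
handler; the field names and values are hypothetical:

```xml
<!-- Sent as each new post arrives; not yet visible to searches -->
<add>
  <doc>
    <field name="id">1000001</field>
    <field name="body">text of the first new post</field>
  </doc>
</add>

<add>
  <doc>
    <field name="id">1000002</field>
    <field name="body">text of the second new post</field>
  </doc>
</add>

<!-- Sent once per batch (for example, every few minutes); this is
     what opens a new searcher and triggers warming -->
<commit/>
```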

-Yonik
