lucene-java-user mailing list archives

From Shai Erera <ser...@gmail.com>
Subject Re: Message via your Google Profile: Lucene limits
Date Mon, 03 Jun 2013 04:51:02 GMT
Hi Oded,

These times sound way too high, even for really hard queries. Can you share
a bit about how you index the documents and what they contain?
Specifically:

   - How many facet dimensions per-document do you have?
   - Are the dimensions unique, i.e. a document has one category from a
   dimension, or can there be multiple ones?
   - Are the dimensions hierarchical or flat?
   - Is the query a MatchAllDocsQuery or something else? If it is something
   else, approximately how many documents do you estimate it matches?
   - Do you use any of the following facet features: partitions, sampling,
   complements?
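
These questions matter because the cost of counting facets grows roughly
with (matching documents) x (categories per document). Below is a minimal
sketch in plain Java that illustrates this. It is not Lucene's facet API:
the class name, the method names and the in-memory int[][] layout are all
hypothetical and chosen only for illustration.

```java
import java.util.Arrays;

public class FacetCountSketch {

    /**
     * Counts category occurrences over the matching documents.
     * docOrdinals[d] holds the category ordinals of document d
     * (a hypothetical layout, not how Lucene stores ordinals).
     */
    static int[] countFacets(int[][] docOrdinals, int[] matchingDocs, int numCategories) {
        int[] counts = new int[numCategories];
        for (int doc : matchingDocs) {
            for (int ord : docOrdinals[doc]) {
                counts[ord]++; // one increment per (hit, category) pair
            }
        }
        return counts;
    }

    /** Rough number of counter increments: hits times categories per document. */
    static long estimatedIncrements(long hits, int categoriesPerDoc) {
        return hits * categoriesPerDoc;
    }

    public static void main(String[] args) {
        // Three tiny documents, two of which match the query.
        int[][] docOrdinals = { {0, 2}, {1, 2}, {0, 3} };
        int[] matching = {0, 2};
        System.out.println(Arrays.toString(countFacets(docOrdinals, matching, 4)));

        // Figures from this thread: about 50 million hits, 8 facet dimensions,
        // assuming one category per dimension per document.
        System.out.println(estimatedIncrements(50_000_000L, 8));
    }
}
```

With the figures from this thread, that is on the order of 400 million
counter increments per query, before any extra work for hierarchical
dimensions or multiple categories per dimension.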

Note that in 4.2 most of the facet package was rewritten, with performance
improvements of up to 4x. Would it be possible for you to migrate your code
to the latest Lucene version?

Shai


On Sun, Jun 2, 2013 at 11:56 PM, Oded Sofer <yotamchook@gmail.com> wrote:

> I am trying to use Lucene at high volume. We implemented Lucene 4.1, and
> search with facets is very slow; it consumes a large amount of RAM and
> causes swapping. Our index is 7 GB and contains 70 million documents
> (fairly short ones). A search (including facets) takes 30-60 seconds, which
> is way too long for our needs. We ran it on an x3550 (Intel) server with
> 12 GB, and other processes run on the same machine. Is this reasonable? Is
> it realistic to expect a 1-3 second search for both hits and facets (we
> have 7-8 facets)? In some cases the number of hits is 50 million or so.
> Thank you, I hope it is okay to email you this question.
> We are using Lucene (not Solr -- I don't know why; we will review moving to
> Solr soon).
> We allocate a maximum of 4-6 GB to Java (we wish we could limit it to
> 2-3 GB at most, or exceed that only for a short time).
>
>
>
> On Sun, Jun 2, 2013 at 10:10 PM, Michael McCandless <mikemccand@gmail.com> wrote:
>
> > Hi Oded,
> >
> > Can you email java-user@lucene.apache.org and ask this?
> >
> > Are you using the lucene facet module (not Solr)?  If so, it should not
> > consume that much RAM, unless you have a high number of unique facet
> > labels.  And 30-60 seconds seems too long, unless you are running a
> > particularly hard query.
> >
> >
> > On Sun, Jun 2, 2013 at 2:54 PM, Oded Sofer <yotamchook@gmail.com> wrote:
> >
> >> Hi Michael, I am trying to use Lucene at high volume. We implemented
> >> Lucene 4.1, and search with facets is very slow; it consumes a large
> >> amount of RAM and causes swapping. Our index is 7 GB and contains
> >> 70 million documents (fairly short ones). A search (including facets)
> >> takes 30-60 seconds, which is way too long for our needs. We ran it on
> >> an x3550 (Intel) server with 12 GB, and other processes run on the same
> >> machine. Is this reasonable? Is it realistic to expect a 1-3 second
> >> search for both hits and facets (we have 7-8 facets)? In some cases the
> >> number of hits is 50 million or so. Thank you, I hope it is okay to
> >> email you this question.
> >> Yotam Oded Sofer
