lucene-java-user mailing list archives

From "Peter Keegan" <peterlkee...@gmail.com>
Subject Re: Announcement: Lucene powering Monster job search index (Beta)
Date Fri, 16 Mar 2007 22:05:42 GMT
Note: this is a reply to a posting to java-dev  --Peter

Eric,

> Now that it is live, is performance pretty good?

Performance is outstanding. Each server can easily handle well over 100 qps
on an index of over 800K documents. There are several servers (4 dual core
(8 CPU) Opteron) supporting the query load and we have backup servers for
disaster recovery. For a few hours one day, all job search query traffic for
the entire site was being handled by a single server - with no noticeable
latency!

>Are you using dotLucene or a webservice tier and java?

We are using Java Lucene on dedicated servers.


>How did you implement your bounding box for the searching? It sounds like
you do this outside of lucene and return a custom hitcollector.

The 'bounding box' is merely the conjunction of 2 numeric range searches.
It's really not that hard to do - I think there has been discussion of this
elsewhere in this group. We use (not 'return') a custom HitCollector to
exclude hits that aren't in the bounding box. I tried to explain this in a
reply earlier today, but if I failed let me know.
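
To make the idea concrete, here is a minimal sketch of a bounding box expressed as the conjunction of two numeric range checks, applied inside a collector-style callback. It is not the code running at Monster and it does not use Lucene classes: the class name BoundingBoxCollector, the in-memory lat/lon arrays indexed by document id, and the hits() accessor are all assumptions made for illustration. Only the collect(doc, score) callback shape is borrowed from HitCollector.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: mimics the collect(doc, score) callback shape only.
// It does not extend or call any Lucene class.
class BoundingBoxCollector {
    // Assumption: per-document coordinates are held in memory, indexed by doc id.
    private final double[] lat;
    private final double[] lon;
    private final double minLat, maxLat, minLon, maxLon;
    private final List<Integer> accepted = new ArrayList<>();

    BoundingBoxCollector(double[] lat, double[] lon,
                         double minLat, double maxLat,
                         double minLon, double maxLon) {
        this.lat = lat;
        this.lon = lon;
        this.minLat = minLat;
        this.maxLat = maxLat;
        this.minLon = minLon;
        this.maxLon = maxLon;
    }

    // Called once per document that matched the keyword part of the query.
    // The score is ignored here because the box test is a pure filter.
    void collect(int doc, float score) {
        boolean inLatRange = lat[doc] >= minLat && lat[doc] <= maxLat;
        boolean inLonRange = lon[doc] >= minLon && lon[doc] <= maxLon;
        // The bounding box is simply the conjunction of the two range tests.
        // Note: this sketch does not handle boxes that cross the 180th meridian.
        if (inLatRange && inLonRange) {
            accepted.add(doc);
        }
    }

    // Document ids that passed the box test, in the order they were collected.
    List<Integer> hits() {
        return accepted;
    }
}
```

Keeping the coordinates in plain arrays means each hit costs two array lookups and four comparisons, which is one plausible reason a collector-side check can be cheaper than expanding a range query into terms.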

> Why not use a rangequery or functionquery for the basic bounding before
sorting

Basically, 'RangeQuery' doesn't offer sufficient performance. We have
implemented our own 'numeric value' search 'next to Lucene' (I think I like
this better than 'outside of Lucene' ;-)).  FunctionQuery could be used if
you wanted the jobs sorted by a combination of keywords and distance. Our
users (apparently) expect the jobs to be sorted strictly by distance on a
radius search.
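
For illustration, sorting strictly by distance on a radius search could look roughly like the sketch below. This is an assumption-laden example, not our implementation: the class DistanceSort, the haversine great-circle formula, the kilometre units, and the in-memory lat/lon arrays are all choices made for the example. It takes the doc ids that survived the bounding box, drops those outside the radius, and orders the rest by distance alone, ignoring keyword score.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, independent of Lucene.
class DistanceSort {
    static final double EARTH_RADIUS_KM = 6371.0; // mean Earth radius

    // Great-circle distance between two points given in degrees (haversine formula).
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double sinLat = Math.sin(dLat / 2);
        double sinLon = Math.sin(dLon / 2);
        double a = sinLat * sinLat
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * sinLon * sinLon;
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Keeps only docs within radiusKm of the center, nearest first.
    static List<Integer> sortByDistance(List<Integer> docs,
                                        double[] lat, double[] lon,
                                        double centerLat, double centerLon,
                                        double radiusKm) {
        List<Integer> within = new ArrayList<>();
        for (int doc : docs) {
            // The bounding box over-selects at its corners, so re-check the radius.
            if (haversineKm(centerLat, centerLon, lat[doc], lon[doc]) <= radiusKm) {
                within.add(doc);
            }
        }
        within.sort(Comparator.comparingDouble(
            (Integer doc) -> haversineKm(centerLat, centerLon, lat[doc], lon[doc])));
        return within;
    }
}
```

The comparator recomputes the distance on each comparison to keep the sketch short; a real system would more likely compute each distance once and cache it alongside the doc id.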

================================================================

Peter

Hello Peter,

Now that the monster lucene search is live, is performance pretty good? Are
you still running it on a single 8 core server? Can you give me a rough idea
on the number of queries you can handle/second and the number of docs in the
index? Are you using dotLucene or a webservice tier and java?

How did you implement your bounding box for the searching? It sounds like
you do this outside of lucene and return a custom hitcollector. Why not use
a rangequery or functionquery for the basic bounding before sorting?

Thanks,
Eric
