lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael Stoppelman <stop...@gmail.com>
Subject Re: Poor QPS with highlighting
Date Thu, 05 Feb 2009 20:47:47 GMT
On Thu, Feb 5, 2009 at 9:05 AM, Jason Rutherglen <jason.rutherglen@gmail.com
> wrote:

> Google uses dedicated highlighting servers.  Maybe this architecture would
> work for you.
>

What's your reference? I used to work at Google.


>
> On Mon, Feb 2, 2009 at 11:24 PM, Michael Stoppelman <stopman@gmail.com
> >wrote:
>
> > Hi all,
> >
> > My search backends are only able to eek out 13-15 qps even with the
> entire
> > index in memory (this makes it very expensive to scale). According to my
> > YourKit profiler 80% of the program's time ends up in highlighting. With
> > highlighting disabled my backend gets about 45-50 qps (cheaper scaling)!
> > We're using Mark's TokenSources contrib. to make reconstructing of the
> > document quicker. I was contemplating patching the index to store offsets
> > for every term (instead of just the ordinal positions) so that I could
> make
> > the highlighting faster (since you would know where you hit in the
> document
> > on the search pass). I saw this thread from 2004:
> > http://www.mail-archive.com/lucene-dev@jakarta.apache.org/msg04743.html-
> > which asks about adding offsets to the index but it was decided against
> > because it would make the index too large. I can totally understand this;
> > but as machines get more beefy it would probably be nice to make this
> > optional since having 15 qps vs 50qps is quite a trade-off right now. Are
> > other folks seeing this? My documents are quite big sometimes up to 300k
> > tokens. Also my document fields are compressed which is also a time sink
> > for
> > the cpu.
> >
> > Please let me know if you need more details, happy to share.
> >
> > Sincerely,
> > M
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message