lucene-java-user mailing list archives

From Yang <teddyyyy...@gmail.com>
Subject Re: lucene (search) performance tuning
Date Sat, 26 May 2012 14:32:45 GMT
I'm using a disjunction (OR) query. Unfortunately, all of the clauses are
optional.
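
To illustrate why an all-optional OR query is expensive, here is a minimal sketch in plain Java (not Lucene code). The toy postings lists and the names DisjunctionSketch, sampleIndex and candidates are hypothetical, made up for illustration. A pure disjunction has to consider every document that contains any query term; requiring a minimum number of matching clauses shrinks the candidate set. Lucene's BooleanQuery has a comparable setting (setMinimumNumberShouldMatch in the 3.x API), but whether it lowers scoring cost in a given Lucene version is not something this sketch shows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class DisjunctionSketch {
    // Hypothetical toy index: term -> ids of the documents containing it.
    static Map<String, int[]> sampleIndex() {
        Map<String, int[]> postings = new HashMap<>();
        postings.put("joe", new int[] {1, 2, 5});
        postings.put("pizza", new int[] {1, 3, 5, 7});
        postings.put("main", new int[] {1, 4, 5, 6});
        postings.put("francisco", new int[] {1, 2, 3, 4, 5, 6, 7, 8});
        return postings;
    }

    // Count, per document, how many of the query terms it contains.
    static Map<Integer, Integer> countMatches(Map<String, int[]> postings, List<String> terms) {
        Map<Integer, Integer> hits = new TreeMap<>();
        for (String term : terms) {
            for (int doc : postings.getOrDefault(term, new int[0])) {
                hits.merge(doc, 1, Integer::sum);
            }
        }
        return hits;
    }

    // Documents matching at least minShouldMatch of the optional clauses,
    // in ascending doc id order. minShouldMatch = 1 is a plain OR query.
    static List<Integer> candidates(Map<String, int[]> postings, List<String> terms,
                                    int minShouldMatch) {
        List<Integer> out = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : countMatches(postings, terms).entrySet()) {
            if (e.getValue() >= minShouldMatch) {
                out.add(e.getKey());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, int[]> index = sampleIndex();
        List<String> query = Arrays.asList("joe", "pizza", "main", "francisco");
        System.out.println("plain OR:       " + candidates(index, query, 1));
        System.out.println("at least 3 of 4: " + candidates(index, query, 3));
    }
}
```

In this toy data the common term "francisco" alone pulls every document into the plain OR result, which is the situation where a very frequent term dominates the scoring work.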

On Sat, May 26, 2012 at 4:38 AM, Simon Willnauer <
simon.willnauer@googlemail.com> wrote:

> On Sat, May 26, 2012 at 2:59 AM, Yang <teddyyyy123@gmail.com> wrote:
> > I tested with more threads / processes. Indeed, this is completely
> > CPU-bound, since running 1 thread gives the same latency as 4 threads (my
> > box has 4 cores).
> >
> > Given this, is there any way to simplify the scoring computation (I'm only
> > using Lucene as a first-level "rough" search, so search quality is not
> > a huge issue here), so that, for example, fewer fields are evaluated or a
> > simpler scoring function is used?
>
> Are you using disjunction or conjunction queries? Can you make some
> parts of the query mandatory?
>
> simon
> >
> > thanks
> > Yang
> >
> > On Fri, May 25, 2012 at 5:47 PM, Yang <teddyyyy123@gmail.com> wrote:
> >
> >> Thanks a lot, guys.
> >>
> >>
> >> On Tue, May 22, 2012 at 1:34 AM, Ian Lea <ian.lea@gmail.com> wrote:
> >>
> >>> Lots of good tips in
> >>> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, linked from
> >>> the FAQ.
> >>>
> >>>
> >>> --
> >>> Ian.
> >>>
> >>>
> >>> On Tue, May 22, 2012 at 2:08 AM, Li Li <fancyerii@gmail.com> wrote:
> >>> > Something went wrong when writing from my Android client.
> >>> > If RAMDirectory does not help, I think the bottleneck is CPU. You may try
> >>> > to tune the JVM, but I do not expect much improvement.
> >>> > The best option is splitting your index into 2 or more smaller ones.
> >>> > You can then use Solr's distributed searching.
> >>> > If the CPU is not fully used, you can do this on one physical machine.
> >>> >
> >>> > On 2012-5-22 at 8:50 AM, "Li Li" <fancyerii@gmail.com> wrote:
> >>> >>
> >>> >>
> >>> >> On 2012-5-22 at 4:59 AM, "Yang" <teddyyyy123@gmail.com> wrote:
> >>> >>
> >>> >> >
> >>> >> > I'm trying to make my search faster. Right now a query like
> >>> >> >
> >>> >> > name:Joe Moe Pizza   address:77 main street  city:San Francisco
> >>> >>
> >>> >> Is this a conjunction query or a disjunction query?
> >>> >>
> >>> >> > in an index with 20 million such short business descriptions (total
> >>> >> > size about 3GB) takes about 100--200ms.
> >>> >>
> >>> >> 20 million is not a small size. How many results does a query
> >>> >> return on average?
> >>> >>
> >>> >> > I profiled the query; most time is spent in TermScorer.score(), as is
> >>> >> > shown by the attached YourKit screenshot.
> >>> >>
> >>> >> That's true. For a query, matching and scoring are very time-consuming
> >>> >> and CPU-intensive. Another cost is I/O for reading postings.
> >>> >>
> >>> >> > I tried loading the index onto tmpfs (an in-memory filesystem), and
> >>> >> > also tried RAMDirectory; neither helps much.
> >>> >>
> >>> >> If that is true, it seems that I/O is not the
> >>> >>
> >>> >> > I am reading
> >>> >> > http://www.cnlp.org/presentations/slides/AdvancedLuceneEU.pdf
> >>> >> > it mentions
> >>> >> > Size
> >>> >> > – Stopword removal
> >>> >> > – Stemming
> >>> >> > • Lucene has a number of stemmers available
> >>> >> > • Light versus Aggressive
> >>> >> > • May prevent fine-grained matches in some cases
> >>> >> > – Not a linear factor (usually) due to index compression
> >>> >> >
> >>> >> > So for "stopword removal", I'm already using the standard analyzer,
> >>> >> > so stop word removal is already included, right?
> >>> >> >
> >>> >> > Also, generally, are there any other tricks to try for reducing the
> >>> >> > search latency?
> >>> >> >
> >>> >> > Thanks!
> >>> >> > Yang
> >>> >> >
> >>> >> >
> >>> >> >
> >>> >> > ---------------------------------------------------------------------
> >>> >> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >>> >> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >>>
> >>>
> >>>
> >>
>
>
>
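
The suggestion quoted above, splitting the index into smaller ones and searching them together, can be sketched roughly as follows. This is plain Java with no Lucene or Solr calls: the names ShardedSearchSketch, Hit, sampleShards and search are hypothetical, and each "shard" is just a list of already-scored hits standing in for a per-shard search. The idea shown is only the scatter/gather step: search the shards in parallel, take the top k from each, then merge and cut to k again. It can lower single-query latency only while there are idle cores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ShardedSearchSketch {
    // A scored document; stands in for whatever a real per-shard search returns.
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    // The k best hits of one list, highest score first.
    static List<Hit> topK(List<Hit> hits, int k) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort((a, b) -> Float.compare(b.score, a.score));
        return new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
    }

    // Search every shard on its own thread, then merge the per-shard top-k lists.
    static List<Hit> search(List<List<Hit>> shards, int k) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, shards.size()));
        try {
            List<Future<List<Hit>>> futures = new ArrayList<>();
            for (List<Hit> shard : shards) {
                futures.add(pool.submit(() -> topK(shard, k)));
            }
            List<Hit> merged = new ArrayList<>();
            for (Future<List<Hit>> f : futures) {
                merged.addAll(f.get());
            }
            return topK(merged, k);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("shard search failed", e);
        } finally {
            pool.shutdown();
        }
    }

    // Hypothetical data: two shards with two scored documents each.
    static List<List<Hit>> sampleShards() {
        List<Hit> a = Arrays.asList(new Hit(1, 0.9f), new Hit(2, 0.2f));
        List<Hit> b = Arrays.asList(new Hit(3, 0.5f), new Hit(4, 0.95f));
        return Arrays.asList(a, b);
    }

    public static void main(String[] args) {
        for (Hit h : search(sampleShards(), 2)) {
            System.out.println("doc " + h.doc + " score " + h.score);
        }
    }
}
```

Merging per-shard top-k lists gives the same top k as searching one combined list only if scores are comparable across shards, which this sketch simply assumes.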
