lucene-java-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: lucene (search) performance tuning
Date Tue, 29 May 2012 00:54:43 GMT
Can you use filter queries? Filters short-circuit a lot of search
processing. "City:San Francisco" is a classic filter: it matches a small
fraction of the documents and it is reused across many queries.
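
As a rough illustration of why a filter saves work, here is a plain-Java
sketch. It is not Lucene's API: the class FilterSketch, its methods, and the
toy in-memory "index" are all made up for this example. The idea it shows is
that the set of documents matching the city is computed once, cached as a bit
set, and then used to skip scoring for every other document.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntToDoubleFunction;

// Conceptual sketch only: this is NOT Lucene's API. All names are hypothetical.
class FilterSketch {
    // Toy stand-in for an index: position = doc id, value = that doc's city.
    private final String[] cityByDoc;
    // One cached bit set per filter value, built on first use and then reused.
    private final Map<String, BitSet> filterCache = new HashMap<>();
    // Counts how often a filter had to be built (to show the effect of caching).
    int filterBuilds = 0;

    FilterSketch(String[] cityByDoc) {
        this.cityByDoc = cityByDoc;
    }

    // Returns the bit set of doc ids whose city equals the given value.
    BitSet cityFilter(String city) {
        BitSet cached = filterCache.get(city);
        if (cached == null) {
            filterBuilds++;
            cached = new BitSet(cityByDoc.length);
            for (int doc = 0; doc < cityByDoc.length; doc++) {
                if (city.equals(cityByDoc[doc])) {
                    cached.set(doc);
                }
            }
            filterCache.put(city, cached);
        }
        return cached;
    }

    // Scores only the docs that pass the filter and returns the best doc id,
    // or -1 if no doc passes. Docs outside the filter are never scored.
    int bestDoc(String city, IntToDoubleFunction scorer) {
        BitSet allowed = cityFilter(city);
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int doc = allowed.nextSetBit(0); doc >= 0; doc = allowed.nextSetBit(doc + 1)) {
            double score = scorer.applyAsDouble(doc);
            if (best < 0 || score > bestScore) {
                best = doc;
                bestScore = score;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        FilterSketch index = new FilterSketch(
                new String[] {"San Francisco", "Oakland", "San Francisco", "San Jose"});
        // Placeholder scoring function; a real scorer is the expensive part.
        IntToDoubleFunction scorer = doc -> 1.0 / (doc + 1);
        System.out.println("best doc: " + index.bestDoc("San Francisco", scorer));
        index.bestDoc("San Francisco", scorer); // second query reuses the cached filter
        System.out.println("filter builds: " + index.filterBuilds);
    }
}
```

In Lucene 3.x the closest built-in pieces are, as far as I recall, a
QueryWrapperFilter wrapped in a CachingWrapperFilter and passed to
IndexSearcher.search(query, filter, n). Please check the javadoc for your
version before relying on that.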

On Sat, May 26, 2012 at 7:32 AM, Yang <teddyyyy123@gmail.com> wrote:
> I'm using a disjunction (OR) query. Unfortunately, all of the clauses are
> optional.
>
> On Sat, May 26, 2012 at 4:38 AM, Simon Willnauer <
> simon.willnauer@googlemail.com> wrote:
>
>> On Sat, May 26, 2012 at 2:59 AM, Yang <teddyyyy123@gmail.com> wrote:
>> > I tested with more threads / processes. indeed this is completely
>> > cpu-bound, since running 1 thread gives the same latency as 4 threads (my
>> > box has 4 cores)
>> >
>> >
>> > given this, is there any way to simplify the scoring computation (i'm only
>> > using lucene as a first level "rough" search, so the search quality is not
>> > a huge issue here), so that, for example, fewer fields are evaluated or a
>> > simpler scoring function is used?
>>
>> are you using disjunction or conjunction queries? Can you make some
>> parts of the query mandatory?
>>
>> simon
>> >
>> > thanks
>> > Yang
>> >
>> > On Fri, May 25, 2012 at 5:47 PM, Yang <teddyyyy123@gmail.com> wrote:
>> >
>> >> thanks a lot guys
>> >>
>> >>
>> >> On Tue, May 22, 2012 at 1:34 AM, Ian Lea <ian.lea@gmail.com> wrote:
>> >>
>> >>> Lots of good tips in
>> >>> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, linked from
>> >>> the FAQ.
>> >>>
>> >>>
>> >>> --
>> >>> Ian.
>> >>>
>> >>>
>> >>> On Tue, May 22, 2012 at 2:08 AM, Li Li <fancyerii@gmail.com> wrote:
>> >>> > Something went wrong when writing in my Android client.
>> >>> > If RAMDirectory does not help, I think the bottleneck is the CPU. You
>> >>> > may try to tune the JVM, but I do not expect much improvement.
>> >>> > The best option is splitting your index into 2 or more smaller ones.
>> >>> > You can then use Solr's distributed searching.
>> >>> > If the CPU is not fully used, you can do this on one physical machine.
>> >>> >
>> >>> > On 2012-5-22 at 8:50 AM, "Li Li" <fancyerii@gmail.com> wrote:
>> >>> >>
>> >>> >>
>> >>> >> On 2012-5-22 at 4:59 AM, "Yang" <teddyyyy123@gmail.com> wrote:
>> >>> >>
>> >>> >> >
>> >>> >> > I'm trying to make my search faster. right now a query like
>> >>> >> >
>> >>> >> > name:Joe Moe Pizza   address:77 main street  city:San Francisco
>> >>> >> >is this a conjunction query or a disjunction query?
>> >>> >>
>> >>> >> > in an index with 20mil such short business descriptions (total size
>> >>> >> > about 3GB) takes about 100--200ms.
>> >>> >> >20m is not a small size, how many results for a query on average?
>> >>> >>
>> >>> >> > I profiled the query, most time is spent in TermScorer.score(), as is
>> >>> >> > shown by the attached YourKit screenshot.
>> >>> >> >that's true, for a query, matching and scoring is very time consuming
>> >>> >> >and cpu intensive. another one is io for reading postings.
>> >>> >>
>> >>> >> >
>> >>> >> >
>> >>> >> >
>> >>> >> > I tried loading the index onto tmpfs (in-memory block device), and also
>> >>> >> > tried RAMDirectory, neither helps much.
>> >>> >> >if that is true. it seems that io is not the
>> >>> >> > I am reading
>> >>> >> > http://www.cnlp.org/presentations/slides/AdvancedLuceneEU.pdf
>> >>> >> > it mentions
>> >>> >> > Size
>> >>> >> > – Stopword removal
>> >>> >> > – Stemming
>> >>> >> > • Lucene has a number of stemmers available
>> >>> >> > • Light versus Aggressive
>> >>> >> > • May prevent fine-grained matches in some cases
>> >>> >> > – Not a linear factor (usually) due to index compression
>> >>> >> >
>> >>> >> > so for "stopword removal", I'm already using the standard analyzer, so
>> >>> >> > stop word removal is already included, right?
>> >>> >> >
>> >>> >> > also generally any other tricks to try for reducing the search latency?
>> >>> >> >
>> >>> >> > Thanks!
>> >>> >> > Yang
>> >>> >> >
>> >>> >> >
>> >>> >> >
>> >>> >> > ---------------------------------------------------------------------
>> >>> >> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> >>> >> > For additional commands, e-mail: java-user-help@lucene.apache.org
>> >>>
>> >>>
>> >>>
>> >>
>>
>>
>>



-- 
Lance Norskog
goksron@gmail.com


