lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sami Dalouche <sko...@free.fr>
Subject Re: Lucene search optimization
Date Wed, 31 May 2006 20:29:26 GMT
Hi,

thanks for the tip !! Yes, basically, I would like to reduce the number
of comparisons.  Using this prefix length seems doable for my problem..
(even though I'm not 100% sure it is appropriate, this has to be
investigated)

Is there a way to use this prefix length (or something similar) on some
other property than the city ? In fact, I can also index the Country, so
if this prefix length could be useable on the country, this could easily
divide the search space by 400, which is way better than /26...

 
Any idea ?
Sami Dalouche


Le mercredi 31 mai 2006 à 20:53 +0100, markharw00d a écrit :
> >>I tried the cityName:city~0.8, and it is still not fast enough..
> >>something around 2 seconds... to return only 2 results...
> 
> OK, so we trimmed down the search terms we actually used in the query but I suspect what
you are seeing is the effect of having to perform edit-distance comparisons on ALL town names
to get to this shortlist. If this is the case then you'll probably be seeing a lot of CPU
activity. One way of avoiding this is to set the "prefix length" parameter on fuzzy queries
to at least one. This determines if you are comparing Rambouillet with ALL terms (as the default
zero prefix length setting does) or just those beginning with "R".
> 
> Assuming an even spread of town names to letters that would cut the computation down
to 1/26th of the original cost.
> 
> Cheers
> Mark
> 
> 
> 
> 
> Sami Dalouche wrote:
> 
> >Hi,
> >
> >Compass offers me any kind of control Lucene does. it gives access to
> >the low level Lucene API if you want too, so if you have a nice way of
> >optimizing it, I can have Compass adapt to that.
> >
> >
> >I tried the cityName:city~0.8, and it is still not fast enough..
> >something around 2 seconds... to return only 2 results...
> >(city:Rambouillet~0.8)
> >
> >Sami Dalouche
> >
> >Le mercredi 31 mai 2006 à 09:28 +0100, mark harwood a écrit :
> >  
> >
> >>>>Actually, I am not using Lucene directly, but a
> >>>>        
> >>>>
> >>wrapper called compass
> >>
> >>
> >>I don't know what controls it offers you then.
> >>One option which could offer a speed up is to raise
> >>the minimum quality match threshold above the default
> >>of 0.5 and use a query string like this:
> >>
> >>  cityName:London~0.8 
> >>
> >>This would reduce the number of alternative terms
> >>considered and therefore the query time.
> >>
> >>
> >>--- Sami Dalouche <skoobi@free.fr> wrote:
> >>
> >>    
> >>
> >>>Hi,
> >>>
> >>>1) Actually, I am not using Lucene directly, but a
> >>>wrapper called
> >>>compass. I am using the find() method of the
> >>>CompassSession, which code
> >>>is :
> >>>public CompassHits find(String query) throws
> >>>CompassException {
> >>>        return
> >>>
> >>>      
> >>>
> >>createQueryBuilder().queryString(query).toQuery().hits();
> >>    
> >>
> >>>    }
> >>>And all of these objects are pure wrappers around
> >>>lucene equivalents,
> >>>nothing more.
> >>>
> >>>
> >>>2) What I am timing is only the find call :
> >>>-- start timer
> >>>CompassHits hits = compassSession.find("cityName:"+
> >>>name+"~");
> >>>-- stop timer
> >>>
> >>>3) I am not sorting anything, but lucene is
> >>>returning the hits by
> >>>relevance. Does this count as sorting ?
> >>>
> >>>4) I tried to time the thing for ~10 queries, and
> >>>the results are
> >>>roughly the same. Can go down to 2 seconds, which is
> >>>still way too
> >>>much...
> >>>
> >>>Thanks for helping
> >>>sami Dalouche
> >>>
> >>>On Tue, 2006-05-30 at 13:58 -0700, Chris Hostetter
> >>>wrote:
> >>>      
> >>>
> >>>>: Fuzzy searching against this property takes
> >>>>        
> >>>>
> >>>around 3 seconds, which is
> >>>      
> >>>
> >>>>: way too much for what I plan to do, so I am
> >>>>        
> >>>>
> >>>considering the possible
> >>>      
> >>>
> >>>>whenever anyone has a question about how to speed
> >>>>        
> >>>>
> >>>up a search, and the
> >>>      
> >>>
> >>>>current amount of time the search takes is more
> >>>>        
> >>>>
> >>>then a second, there are a
> >>>      
> >>>
> >>>>few questions i allways want to ask:
> >>>>
> >>>> 1) what method exactly on the Searcher interface
> >>>>        
> >>>>
> >>>are you using the
> >>>      
> >>>
> >>>>    execute the search?
> >>>> 2) what exactly are you timing? (the time the
> >>>>        
> >>>>
> >>>search method call takes?,
> >>>      
> >>>
> >>>>    the time it takes you to iterate over the
> >>>>        
> >>>>
> >>>results? etc...)
> >>>      
> >>>
> >>>> 3) are you sorting by any particular field?
> >>>> 4) are you reusing the Searcher instance for more
> >>>>        
> >>>>
> >>>then one query?   are
> >>>      
> >>>
> >>>>    you timing more then one query and taking the
> >>>>        
> >>>>
> >>>average?
> >>>      
> >>>
> >>>>-Hoss
> >>>>
> >>>>
> >>>>
> >>>>        
> >>>>
> >>---------------------------------------------------------------------
> >>    
> >>
> >>>>To unsubscribe, e-mail:
> >>>>        
> >>>>
> >>>java-user-unsubscribe@lucene.apache.org
> >>>      
> >>>
> >>>>For additional commands, e-mail:
> >>>>        
> >>>>
> >>>java-user-help@lucene.apache.org
> >>>      
> >>>
> >>>
> >>>      
> >>>
> >>---------------------------------------------------------------------
> >>    
> >>
> >>>To unsubscribe, e-mail:
> >>>java-user-unsubscribe@lucene.apache.org
> >>>For additional commands, e-mail:
> >>>java-user-help@lucene.apache.org
> >>>
> >>>
> >>>      
> >>>
> >>
> >>		
> >>___________________________________________________________ 
> >>The all-new Yahoo! Mail goes wherever you go - free your email address from your
Internet provider. http://uk.docs.yahoo.com/nowyoucan.html
> >>
> >>---------------------------------------------------------------------
> >>To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >>For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>    
> >>
> >
> >
> >---------------------------------------------------------------------
> >To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> >
> >  
> >
> 
> 
> Send instant messages to your online friends http://uk.messenger.yahoo.com 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message