lucene-solr-user mailing list archives

From Walter Underwood <wunderw...@netflix.com>
Subject Re: Your valuable suggestion on autocomplete
Date Tue, 06 May 2008 15:21:31 GMT
I wrote a prefix map (ternary search tree) in Java and load it with
queries to Solr every two hours. That keeps the autocomplete and
search index in sync.
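
For readers unfamiliar with the structure, here is a minimal sketch of a ternary search tree used as a prefix map. It is not the code described above; the class name `PrefixMap` and the methods `put` and `complete` are made up for illustration, and a production version would probably also store a weight per key so that popular completions can be ranked first.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative ternary search tree for prefix lookup (hypothetical names). */
class PrefixMap {
    private static final class Node {
        final char ch;
        Node lo, eq, hi;      // smaller char, next char of the key, larger char
        boolean terminal;     // true if a stored key ends at this node
        Node(char ch) { this.ch = ch; }
    }

    private Node root;

    /** Adds a key; null or empty keys are ignored. */
    void put(String key) {
        if (key == null || key.isEmpty()) return;
        root = put(root, key, 0);
    }

    private Node put(Node n, String key, int i) {
        char c = key.charAt(i);
        if (n == null) n = new Node(c);
        if (c < n.ch) n.lo = put(n.lo, key, i);
        else if (c > n.ch) n.hi = put(n.hi, key, i);
        else if (i < key.length() - 1) n.eq = put(n.eq, key, i + 1);
        else n.terminal = true;
        return n;
    }

    /** Returns up to limit stored keys that start with prefix, in char order. */
    List<String> complete(String prefix, int limit) {
        List<String> out = new ArrayList<>();
        if (prefix == null || prefix.isEmpty() || limit <= 0) return out;
        // Walk down to the node that matches the last character of the prefix.
        Node n = root;
        int i = 0;
        while (n != null) {
            char c = prefix.charAt(i);
            if (c < n.ch) n = n.lo;
            else if (c > n.ch) n = n.hi;
            else if (i == prefix.length() - 1) break;
            else { n = n.eq; i++; }
        }
        if (n == null) return out;
        if (n.terminal) out.add(prefix);
        collect(n.eq, new StringBuilder(prefix), out, limit);
        return out;
    }

    /** In-order traversal of a subtree, appending complete keys to out. */
    private void collect(Node n, StringBuilder sb, List<String> out, int limit) {
        if (n == null || out.size() >= limit) return;
        collect(n.lo, sb, out, limit);
        if (out.size() < limit) {
            sb.append(n.ch);
            if (n.terminal) out.add(sb.toString());
            collect(n.eq, sb, out, limit);
            sb.setLength(sb.length() - 1);
        }
        collect(n.hi, sb, out, limit);
    }
}
```

Each lookup walks at most a few nodes per typed character and never touches the search index, which is the point of keeping the structure in memory.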

Our autocomplete gets over 25M hits per day, so we don't really
want to send all that traffic to Solr.
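
The two-hour refresh could be sketched roughly as below. This is an assumption-laden illustration, not the actual setup: `AutocompleteReloader` and `termSource` are hypothetical names, `termSource` stands in for whatever code queries Solr, and a sorted set stands in for the prefix map. The idea shown is to build a fresh snapshot off to the side and swap it in atomically, so lookups never see a half-loaded structure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Illustrative periodic reload with an atomic snapshot swap (hypothetical names). */
class AutocompleteReloader {
    private final AtomicReference<NavigableSet<String>> current =
            new AtomicReference<>(new TreeSet<>());
    private final Supplier<List<String>> termSource;  // assumed to wrap the Solr queries

    AutocompleteReloader(Supplier<List<String>> termSource) {
        this.termSource = termSource;
    }

    /** Builds a new snapshot from the source and publishes it in one step. */
    void reload() {
        NavigableSet<String> fresh = new TreeSet<>(termSource.get());
        current.set(fresh);
    }

    /** Schedules reload() immediately and then every two hours. */
    ScheduledExecutorService start() {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleAtFixedRate(this::reloadQuietly, 0, 2, TimeUnit.HOURS);
        return ses;
    }

    private void reloadQuietly() {
        try {
            reload();
        } catch (RuntimeException e) {
            // Keep serving the previous snapshot; an uncaught exception
            // would cancel all future runs of the scheduled task.
        }
    }

    /** Returns up to limit terms starting with prefix from the current snapshot. */
    List<String> complete(String prefix, int limit) {
        List<String> out = new ArrayList<>();
        for (String term : current.get().tailSet(prefix, true)) {
            if (!term.startsWith(prefix) || out.size() >= limit) break;
            out.add(term);
        }
        return out;
    }
}
```

If a reload fails, the previous snapshot stays in place, so autocomplete keeps working even when Solr is briefly unavailable.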

wunder

On 5/6/08 2:37 AM, "Nishant Soni" <nishant_soni_is@yahoo.com> wrote:

> Just FYI, we have also implemented a trie approach (outside of Solr, even
> though our mail search uses Solr) at the link in the signature.
> 
> You can try out the auto-completion working on the comparison tool on the home
> page.
> 
> - nishant
> 
> www.reviewgist.com
> 
> 
>  
> 
> 
> ----- Original Message ----
> From: Vaijanath N. Rao <vaiju1981@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, May 6, 2008 12:43:25 PM
> Subject: Re: Your valuable suggestion on autocomplete
> 
> Hi Rantjil Bould,
> 
> I would suggest considering a trie, a data structure commonly used for
> autocomplete. Hitting Solr for every prefix looks like a time-consuming
> job, but I might be wrong. I have a trie implementation and it works
> very fast (of course, it is an in-memory data structure, unlike the Solr
> index, which lies on disk).
> 
> --Thanks and Regards
> Vaijanath
> 
> 
> 
> Rantjil Bould wrote:
>> Hi Group,
>>              I have already got some valuable suggestions from the group. Based
>> on those, I have come up with the following process to finally implement an
>> autocomplete-like feature in my system:
>> 1- Index all the documents
>> 2- Extract all terms using indexReader's terms() method
>> 
>> I am getting terms like vl,vla,vlan,vlana,vlanan,vlanand. But I would like
>> to get only the absolute (complete) terms, i.e. vlanand. The field definition
>> in Solr is
>> 
>> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>>       <analyzer type="index">
>>         <tokenizer class="solr.WhitespaceTokenizerFactory"></tokenizer>
>>         <filter class="solr.StopFilterFactory" ignoreCase="true"
>> words="stopwords.txt" enablePositionIncrements="true"></filter>
>>         <filter class="solr.WordDelimiterFilterFactory"
>> generateWordParts="1" generateNumberParts="1" catenateWords="1"
>> catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"></filter>
>>         <filter class="solr.LowerCaseFilterFactory"></filter>
>>         <filter class="solr.EnglishPorterFilterFactory"
>> protected="protwords.txt"></filter>
>>         <filter class="solr.RemoveDuplicatesTokenFilterFactory"></filter>
>>       </analyzer>
>>       <analyzer type="query">
>>         <tokenizer class="solr.WhitespaceTokenizerFactory"></tokenizer>
>>         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>> ignoreCase="true" expand="true"></filter>
>>         <filter class="solr.StopFilterFactory" ignoreCase="true"
>> words="stopwords.txt"></filter>
>>         <filter class="solr.WordDelimiterFilterFactory"
>> generateWordParts="1" generateNumberParts="1" catenateWords="0"
>> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"></filter>
>>         <filter class="solr.LowerCaseFilterFactory"></filter>
>>         <filter class="solr.EnglishPorterFilterFactory"
>> protected="protwords.txt"></filter>
>>         <filter class="solr.RemoveDuplicatesTokenFilterFactory"></filter>
>>       </analyzer>
>>     </fieldType>
>> 
>> I would appreciate your input on how to get only the absolute terms.
>> 
>> 3- For each term, extract the documents containing that term using the
>> termDocs() method
>> 4- Create one more index with the fields term, frequency and docNo. This
>> index would be used for the autocomplete feature.
>> 5- For each letter the user types in the search field, use an Ajax script
>> (like scriptaculous or jQuery) to fetch all matching terms using a prefix query.
>> 6- Based on the search term selected by the user, keep track of the document
>> numbers in which this term occurs.
>> 7- For the next search term selection, use those document numbers to select
>> all terms, excluding the currently selected term.
>> 
>> This somehow works. As I am new to Solr and also to Lucene, I would like to
>> know whether it can be improved.
>> 
>> - RB
>> 
>>  
> 
> 
>       

