lucene-java-user mailing list archives

From Ankit Murarka <ankit.mura...@rancoretech.com>
Subject Re: Did you Mean search on Indexes created by Different Files.
Date Tue, 30 Jul 2013 11:19:58 GMT
Hello.

DirectSpellChecker is not serving my purpose. It seems to return 
word suggestions from a dictionary, whereas I want search suggestions 
drawn from the indexes I built from my own files (these are mostly 
log files).

I created indexes for certain files in D:\\Indexes, and the field name is 
"content".

Then I created a DirectSpellChecker, passed an IndexReader to it, called 
suggestSimilar, and iterated over the SuggestWord array it returned.

I seem to get suggested words from the dictionary and not from the indexes.

Code Snippet for the search part:

String index = "D:\\Indexes";
String field = "contents";
IndexReader reader = DirectoryReader.open(FSDirectory.open(new File(index)));
DirectSpellChecker dsc = new DirectSpellChecker();
Term term1 = new Term(field, "Amrih");
SuggestWord[] suggestWord = dsc.suggestSimilar(term1, 10, reader);
if (suggestWord != null && suggestWord.length > 0) {
    for (SuggestWord word : suggestWord) {
        System.out.println("Did you Mean  " + word.string);
    }
} else {
    System.out.println("No Suggestions found");
}


Please guide. The suggested words should come from the indexes I have 
created, not from any dictionary. Is that possible?


On 7/29/2013 9:34 PM, Varun Thacker wrote:
> Hi,
>
>
> On Mon, Jul 29, 2013 at 4:36 PM, Ankit Murarka<
> ankit.murarka@rancoretech.com>  wrote:
>
>    
>> Since I am new to this, I can't stop exploring it and trying to use
>> different features.
>>
>> I am now trying to implement "Did you Mean " search using SpellChecker jar
>> and Lucene jar.
>>
>> The problem I faced are plenty although I have got it working..
>>
>> code snippet:
>>
>> File dir = new File("D:\\Inde\\");
>> Directory directory = FSDirectory.open(dir);
>> SpellChecker spellChecker = new SpellChecker(directory);
>> String wordForSuggestions = "aski";
>> Analyzer analyzer=new CustomAnalyzerForCaseSensitive(Version.LUCENE_43);
>>   //This analyzer only has commented LowerCaseFilter.
>> IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_43,
>> analyzer);
>> IndexWriter writer = new IndexWriter(directory, iwc);
>> File file1=new File("D:\\Inde\\wordlist.txt");
>> indexDocs(writer,file1);
>> writer.close();
>> spellChecker.indexDictionary(
>> new PlainTextDictionary(new File("D:\\Inde\\wordlist.txt")), iwc,
>> false);
>> int suggestionsNumber = 10;
>> String[] suggestions = spellChecker.
>> suggestSimilar(wordForSuggestions, suggestionsNumber);
>> if (suggestions!=null&&  suggestions.length>0) {
>>
>>              for (String word : suggestions) {
>>
>>                  System.out.println("Did you mean:" + word + "");
>>
>>              }
>>
>>          }
>> else {
>>
>>              System.out.println("No suggestions found for word:"+wordForSuggestions);
>>
>>          }
>>
>> The code works fine. It suggest me 10 possible matches.
>> Problem is here I am creating/updating indexes everytime.
>>
>> Say suppose I have 1000 log files and these files are indexed in
>> D:\\LogIndexes. Instead of reading a standard dictionary and building up
>> indexes, I wish to use these indexes to suggest me possible match..
>>
>> Is it possible to do?. If yes, what can be the approach. Please provide
>> some assistance.
>>
>>      
>
>   Check out DirectSpellChecker (
> http://lucene.apache.org/core/4_3_1/suggest/org/apache/lucene/search/spell/DirectSpellChecker.html
>   )
>
> Using DirectSpellChecker you do not need to build a separate spell index,
> instead using the actual index for spell suggestions.
>
>
>
>    
>> Next question would be to suggest a phrase. If I enter "Head ach heav" ,
>> then I should get "Head ache heavy" as one possible suggestion. haven't
>> tried it yet but surely will be an absolute beauty to have it..
>>
>>      
> DirectSpellChecker works on a term so there is no feature which will give
> you suggestions on a phrase out of the box.
>
> You might want to take each term of the query and check for spell mistakes,
> and then combine them back again. You could look up the code from
> Solr.SpellCheckComponent.addCollationsToResponse
>
> http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.collate
>
> http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/component/SpellCheckComponent.java
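
The per-term approach described above can be sketched roughly as follows. This is only an illustration in plain Java, not Lucene or Solr code: the class PhraseSuggestSketch, its methods, and the in-memory vocabulary list are all hypothetical stand-ins. In a real setup the candidate terms would come from the index (for example via DirectSpellChecker), not from a hard-coded list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: correct each term on its own, then join the terms again.
// The vocabulary list stands in for the terms stored in an index.
class PhraseSuggestSketch {

    // Levenshtein edit distance, computed with two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the term unchanged if it is known; otherwise the closest
    // vocabulary entry within maxEdits, or the term itself if none is close enough.
    static String correctTerm(String term, List<String> vocabulary, int maxEdits) {
        if (vocabulary.contains(term)) {
            return term;
        }
        String best = term;
        int bestDistance = maxEdits + 1;
        for (String candidate : vocabulary) {
            int d = editDistance(term, candidate);
            if (d < bestDistance) {
                bestDistance = d;
                best = candidate;
            }
        }
        return best;
    }

    // Splits the phrase on whitespace, corrects each term, and recombines them.
    static String suggestPhrase(String phrase, List<String> vocabulary, int maxEdits) {
        List<String> corrected = new ArrayList<>();
        for (String term : phrase.trim().split("\\s+")) {
            corrected.add(correctTerm(term, vocabulary, maxEdits));
        }
        return String.join(" ", corrected);
    }

    public static void main(String[] args) {
        List<String> vocabulary = Arrays.asList("Head", "ache", "heavy");
        // prints "Head ache heavy"
        System.out.println(suggestPhrase("Head ach heav", vocabulary, 2));
    }
}
```

Note that this corrects every term independently, so it cannot tell whether the combined phrase actually occurs anywhere; the collation logic in Solr's SpellCheckComponent addresses that part.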
>
>
>
>
>    
>> Also examples available on net for "Did you mean" are very very old and
>> API have undergone significant changes thus making them not so very useful.
>>
>>
>> --
>> Regards
>>
>> Ankit Murarka
>>
>> "Peace is found not in what surrounds us, but in what we hold within."
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>      
>
>    


-- 
Regards

Ankit Murarka

"Peace is found not in what surrounds us, but in what we hold within."



