lucene-java-user mailing list archives

From Ankit Murarka <ankit.mura...@rancoretech.com>
Subject Re: Did you Mean search on Indexes created by Different Files.
Date Wed, 31 Jul 2013 05:45:37 GMT
Any help on this would be highly appreciated. I have tried every option
I could find, but to no avail.

I also tried LuceneDictionary, but that does not seem to help either.

Please guide.

On 7/30/2013 4:49 PM, Ankit Murarka wrote:
> Hello.
>
> Using DirectSpellChecker is not serving my purpose. It seems to
> return word suggestions from a dictionary, whereas I want search
> suggestions from the indexes I created from my own files (these
> files are generally log files).
>
> I created indexes for certain files in D:\\Indexes, and the field name
> is "content".
>
> Then I created a DirectSpellChecker, passed an IndexReader to it,
> called suggestSimilar, and iterated over the SuggestWord array it
> returned.
>
> The suggested words seem to come from the dictionary and not from the
> indexes.
>
> Code Snippet for the search part:
>
> String index="D:\\Indexes";
> String field = "contents";
> IndexReader reader = DirectoryReader.open(FSDirectory.open(new File(index)));
> DirectSpellChecker dsc = new DirectSpellChecker();
> Term term1 = new Term(field, "Amrih");
> SuggestWord[] suggestWord = dsc.suggestSimilar(term1, 10, reader);
> if (suggestWord != null && suggestWord.length > 0) {
>     for (SuggestWord word : suggestWord) {
>         System.out.println("Did you Mean  " + word.string);
>     }
> } else {
>     System.out.println("No Suggestions found");
> }
>
>
> Please guide. Basically, the suggested words should come from the
> indexes I have created, not from any dictionary.
> Is that possible?
>
>
> On 7/29/2013 9:34 PM, Varun Thacker wrote:
>> Hi,
>>
>>
>> On Mon, Jul 29, 2013 at 4:36 PM, Ankit Murarka<
>> ankit.murarka@rancoretech.com>  wrote:
>>
>>> Since I am new to this, I can't stop exploring it and trying to use
>>> different features.
>>>
>>> I am now trying to implement "Did you Mean" search using the
>>> SpellChecker jar and the Lucene jar.
>>>
>>> I ran into plenty of problems, although I have got it working.
>>>
>>> code snippet:
>>>
>>> File dir = new File("D:\\Inde\\");
>>> Directory directory = FSDirectory.open(dir);
>>> SpellChecker spellChecker = new SpellChecker(directory);
>>> String wordForSuggestions = "aski";
>>> Analyzer analyzer = new CustomAnalyzerForCaseSensitive(Version.LUCENE_43);
>>> // This analyzer only has the LowerCaseFilter commented out.
>>> IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_43, analyzer);
>>> IndexWriter writer = new IndexWriter(directory, iwc);
>>> File file1 = new File("D:\\Inde\\wordlist.txt");
>>> indexDocs(writer, file1);
>>> writer.close();
>>> spellChecker.indexDictionary(
>>>     new PlainTextDictionary(new File("D:\\Inde\\wordlist.txt")), iwc, false);
>>> int suggestionsNumber = 10;
>>> String[] suggestions =
>>>     spellChecker.suggestSimilar(wordForSuggestions, suggestionsNumber);
>>> if (suggestions != null && suggestions.length > 0) {
>>>     for (String word : suggestions) {
>>>         System.out.println("Did you mean:" + word + "");
>>>     }
>>> } else {
>>>     System.out.println("No suggestions found for word:" + wordForSuggestions);
>>> }
>>>
>>>
>>> The code works fine. It suggests 10 possible matches.
>>> The problem is that I am creating/updating the indexes every time.
>>>
>>> Suppose I have 1000 log files and these files are indexed in
>>> D:\\LogIndexes. Instead of reading a standard dictionary and
>>> building indexes from it, I want to use these existing indexes to
>>> suggest possible matches.
>>>
>>> Is this possible? If yes, what would be the approach? Please provide
>>> some assistance.
>>>
>>
>> Check out DirectSpellChecker:
>> http://lucene.apache.org/core/4_3_1/suggest/org/apache/lucene/search/spell/DirectSpellChecker.html
>>
>> With DirectSpellChecker you do not need to build a separate spell
>> index; it uses the actual index for spell suggestions.
>>
>>
>>
>>> My next question is about suggesting a phrase. If I enter
>>> "Head ach heav", I should get "Head ache heavy" as one possible
>>> suggestion. I haven't tried it yet, but it would be an absolute
>>> beauty to have.
>>>
>> DirectSpellChecker works on a single term, so there is no feature
>> that gives you suggestions for a phrase out of the box.
>>
>> You might want to take each term of the query, check it for spelling
>> mistakes, and then combine the corrected terms back into a phrase.
>> You could look at the code in Solr's
>> SpellCheckComponent.addCollationsToResponse:
>>
>> http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.collate
>>
>> http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/component/SpellCheckComponent.java
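
The per-term approach described above (split the phrase, correct each
term, join the results) can be sketched in plain Java. This is only an
illustration under stated assumptions: PhraseSuggestSketch, suggestTerm
and suggestPhrase are hypothetical names and are not part of Lucene or
Solr, and the small in-memory vocabulary with an edit-distance lookup is
a stand-in for the per-term spell checker. In a real setup the function
passed to suggestPhrase would presumably wrap a call such as
DirectSpellChecker.suggestSimilar against the index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper (not a Lucene class): corrects a phrase one term at a time.
final class PhraseSuggestSketch {

    // Levenshtein edit distance between two strings (case-sensitive).
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Stand-in for a per-term spell checker. Returns the word unchanged if it
    // is in the vocabulary, otherwise the closest vocabulary word within
    // maxEdits, or the word itself when nothing is close enough.
    static String suggestTerm(String word, List<String> vocabulary, int maxEdits) {
        if (vocabulary.contains(word)) {
            return word;
        }
        String best = word;
        int bestDistance = maxEdits + 1;
        for (String candidate : vocabulary) {
            int distance = editDistance(word, candidate);
            if (distance < bestDistance) {
                bestDistance = distance;
                best = candidate;
            }
        }
        return best;
    }

    // Splits the phrase on whitespace, corrects each term independently,
    // and joins the corrected terms back with single spaces.
    static String suggestPhrase(String phrase, Function<String, String> termSuggester) {
        List<String> corrected = new ArrayList<>();
        for (String term : phrase.trim().split("\\s+")) {
            corrected.add(termSuggester.apply(term));
        }
        return String.join(" ", corrected);
    }

    public static void main(String[] args) {
        // Assumed vocabulary; in practice these would be terms from the index.
        List<String> vocabulary = Arrays.asList("Head", "ache", "heavy");
        String suggestion = suggestPhrase("Head ach heav",
                word -> suggestTerm(word, vocabulary, 2));
        System.out.println("Did you mean: " + suggestion);
    }
}
```

Each term is corrected in isolation here, so the result is not checked
against the index as a whole phrase. Verifying that the combined phrase
actually returns hits is the extra step that Solr's collation code
performs.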

>>
>>
>>
>>
>>
>>> Also, the examples available on the net for "Did you mean" are very
>>> old, and the API has undergone significant changes since then, which
>>> makes them not very useful.
>>>
>>>
>>> -- 
>>> Regards
>>>
>>> Ankit Murarka
>>>
>>> "Peace is found not in what surrounds us, but in what we hold within."
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>
>


-- 
Regards

Ankit Murarka

"Peace is found not in what surrounds us, but in what we hold within."


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

