lucene-java-user mailing list archives

From Varun Thacker <varunthacker1...@gmail.com>
Subject Re: Did you Mean search on Indexes created by Different Files.
Date Mon, 29 Jul 2013 16:04:42 GMT
Hi,


On Mon, Jul 29, 2013 at 4:36 PM, Ankit Murarka <
ankit.murarka@rancoretech.com> wrote:

> Since I am new to this, I can't stop exploring it and trying to use
> different features.
>
> I am now trying to implement "Did you Mean " search using SpellChecker jar
> and Lucene jar.
>
> I ran into plenty of problems, although I have got it working.
>
> code snippet:
>
> File dir = new File("D:\\Inde\\");
> Directory directory = FSDirectory.open(dir);
> SpellChecker spellChecker = new SpellChecker(directory);
> String wordForSuggestions = "aski";
> Analyzer analyzer = new CustomAnalyzerForCaseSensitive(Version.LUCENE_43);
>  // This analyzer only has the LowerCaseFilter commented out.
> IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_43,
> analyzer);
> IndexWriter writer = new IndexWriter(directory, iwc);
> File file1 = new File("D:\\Inde\\wordlist.txt");
> indexDocs(writer, file1);
> writer.close();
> spellChecker.indexDictionary(
> new PlainTextDictionary(new File("D:\\Inde\\wordlist.txt")), iwc,
> false);
> int suggestionsNumber = 10;
> String[] suggestions = spellChecker.
> suggestSimilar(wordForSuggestions, suggestionsNumber);
> if (suggestions!=null && suggestions.length>0) {
>
>             for (String word : suggestions) {
>
>                 System.out.println("Did you mean:" + word + "");
>
>             }
>
>         }
> else {
>
>             System.out.println("No suggestions found for word:" + wordForSuggestions);
>
>         }
>
> The code works fine. It suggests 10 possible matches.
> The problem is that here I am creating/updating the indexes every time.
>
> Suppose I have 1000 log files and these files are indexed in
> D:\\LogIndexes. Instead of reading a standard dictionary and building up
> indexes, I wish to use these existing indexes to suggest possible matches.
>
> Is this possible? If yes, what would the approach be? Please provide
> some assistance.
>


 Check out DirectSpellChecker (
http://lucene.apache.org/core/4_3_1/suggest/org/apache/lucene/search/spell/DirectSpellChecker.html
 )

With DirectSpellChecker you do not need to build a separate spell index;
it draws its suggestions directly from the terms in your actual index.
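
As a rough sketch of what that could look like against the log index from
your question (not tested against your data): the field name "contents" is
only an assumption, so substitute whichever field your log text was indexed
into. The index path is the one you mentioned.

```java
import java.io.File;
import java.io.IOException;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.spell.DirectSpellChecker;
import org.apache.lucene.search.spell.SuggestMode;
import org.apache.lucene.search.spell.SuggestWord;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

class DirectSuggestExample {
    public static void main(String[] args) throws IOException {
        // Open the existing index of the log files; nothing is written to it.
        Directory directory = FSDirectory.open(new File("D:\\LogIndexes"));
        IndexReader reader = DirectoryReader.open(directory);
        try {
            DirectSpellChecker spellChecker = new DirectSpellChecker();
            // DirectSpellChecker lowercases the query term by default. With an
            // analyzer that keeps case, you probably want to switch that off.
            spellChecker.setLowerCaseTerms(false);

            // ASSUMPTION: "contents" is a placeholder field name.
            Term term = new Term("contents", "aski");
            SuggestWord[] suggestions = spellChecker.suggestSimilar(
                    term, 10, reader, SuggestMode.SUGGEST_ALWAYS);

            if (suggestions.length == 0) {
                System.out.println("No suggestions found for word:" + term.text());
            }
            for (SuggestWord suggestion : suggestions) {
                System.out.println("Did you mean:" + suggestion.string
                        + " (freq=" + suggestion.freq + ")");
            }
        } finally {
            reader.close();
            directory.close();
        }
    }
}
```

SuggestMode.SUGGEST_ALWAYS returns suggestions even when the word itself
occurs in the index; SUGGEST_WHEN_NOT_IN_INDEX is the alternative if you only
want suggestions for unknown words.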



> The next question would be how to suggest a phrase. If I enter "Head ach heav",
> then I should get "Head ache heavy" as one possible suggestion. I haven't
> tried it yet, but it would surely be an absolute beauty to have.
>

DirectSpellChecker works on a single term, so nothing out of the box will
give you suggestions for a whole phrase.

You might want to take each term of the query, check it for spelling
mistakes, and then combine the corrected terms back into a phrase. For
reference, look at the code of addCollationsToResponse in Solr's
SpellCheckComponent:

http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.collate

http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/component/SpellCheckComponent.java
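
A minimal sketch of the term-by-term idea, in plain Java so it runs without
Lucene. PhraseCorrector and the map-backed suggester are hypothetical names
for illustration only; in practice the suggester would wrap a call to
DirectSpellChecker and pick its top suggestion. This is far simpler than
what Solr's collation does and is not a copy of it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

final class PhraseCorrector {

    // Corrects a phrase one term at a time. The suggester returns the best
    // replacement for a term, or an empty Optional to keep the term unchanged.
    static String correct(String phrase,
                          Function<String, Optional<String>> suggester) {
        List<String> corrected = new ArrayList<>();
        for (String term : phrase.trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue; // only happens for an empty or all-blank phrase
            }
            corrected.add(suggester.apply(term).orElse(term));
        }
        return String.join(" ", corrected);
    }

    public static void main(String[] args) {
        // Stand-in for a real spell checker: a fixed table of corrections.
        Map<String, String> fixes = new HashMap<>();
        fixes.put("ach", "ache");
        fixes.put("heav", "heavy");
        Function<String, Optional<String>> suggester =
                term -> Optional.ofNullable(fixes.get(term));

        System.out.println(correct("Head ach heav", suggester)); // Head ache heavy
    }
}
```

Correcting each term independently can produce a phrase that matches no
document at all, so you may want to run the combined phrase as a query and
only offer it when it returns hits.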




>
> Also, the examples available on the net for "Did you mean" are very old, and
> the API has undergone significant changes, which makes them not very useful.
>
>
> --
> Regards
>
> Ankit Murarka
>
> "Peace is found not in what surrounds us, but in what we hold within."
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


-- 


Regards,
Varun Thacker
http://www.vthacker.in/
