lucene-java-user mailing list archives

From Amin Mohammed-Coleman <ami...@gmail.com>
Subject Re: SpellChecker in use with composite query
Date Sat, 11 Apr 2009 20:59:57 GMT
Hi
Another thing I was wondering about is when to build the spell index.
Where is the most appropriate place to create it?


For example:

IndexReader spellReader = IndexReader.open(fsDirectory1);

IndexReader spellReader2 = IndexReader.open(fsDirectory2);

MultiReader multiReader = new MultiReader(new IndexReader[]
{spellReader,spellReader2});

LuceneDictionary luceneDictionary = new LuceneDictionary(multiReader,
"content");

Directory spellDirectory = FSDirectory.getDirectory(<single index for
spellcheck>);

SpellChecker spellChecker = new SpellChecker(spellDirectory);

spellChecker.indexDictionary(luceneDictionary);


Should this be done at search time or when a document is indexed?
Should I clear the spell index when the main index changes?


While running some tests I also noticed that the spell index contained
numbers from the text extracted from a document.  Is there a way to include
only alphabetic words in the indexDictionary process?
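
One possible way to keep numbers out of the spell index is to filter the words before they reach indexDictionary. The sketch below is plain Java and does not touch Lucene: AlphabeticWordIterator and its filter helper are hypothetical names, not part of any library. It assumes the dictionary hands its words out through an ordinary Iterator, in which case a small custom dictionary could wrap the iterator of a LuceneDictionary with it; that wiring is an assumption and is not shown here.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical helper (not part of Lucene): wraps any word iterator and
// passes through only words made up entirely of letters.
class AlphabeticWordIterator implements Iterator<String> {
    private final Iterator<String> source;
    private String next;

    AlphabeticWordIterator(Iterator<String> source) {
        this.source = source;
        advance();
    }

    // Move to the next purely alphabetic word, or to null when exhausted.
    private void advance() {
        next = null;
        while (source.hasNext()) {
            String candidate = source.next();
            if (isAlphabetic(candidate)) {
                next = candidate;
                return;
            }
        }
    }

    // True only for non-empty words in which every character is a letter.
    static boolean isAlphabetic(String word) {
        if (word == null || word.isEmpty()) {
            return false;
        }
        for (int i = 0; i < word.length(); i++) {
            if (!Character.isLetter(word.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public boolean hasNext() {
        return next != null;
    }

    public String next() {
        if (next == null) {
            throw new NoSuchElementException();
        }
        String result = next;
        advance();
        return result;
    }

    public void remove() {
        throw new UnsupportedOperationException();
    }

    // Convenience for trying the filter out on a plain list of words.
    static List<String> filter(List<String> words) {
        List<String> kept = new ArrayList<String>();
        Iterator<String> it = new AlphabeticWordIterator(words.iterator());
        while (it.hasNext()) {
            kept.add(it.next());
        }
        return kept;
    }
}
```

Note that this drops mixed tokens such as "doc42" as well as pure numbers; if those should stay, the isAlphabetic check is the place to relax.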



Any help would be appreciated.


Cheers

On Fri, Apr 10, 2009 at 2:28 PM, Amin Mohammed-Coleman <aminmc@gmail.com> wrote:

> Hi
> I have been playing around with the SpellChecker class and so far it looks
> really good.  While developing a testcase to show it working I came across a
> couple of issues which I have resolved but I'm not certain if this is the
> correct approach.  I would therefore be grateful if anyone could tell me
> whether it is correct or I should try something else.
>
> 1) Multiple indexes:
> I have multiple indexes which store different documents based on certain
> subject matter.  So in order to perform the spellchecking against all indexes
> I did something like this:
>
> IndexReader spellReader = IndexReader.open(fsDirectory1);
>
> IndexReader spellReader2 = IndexReader.open(fsDirectory2);
>
> MultiReader multiReader = new MultiReader(new IndexReader[]
> {spellReader,spellReader2});
>
> LuceneDictionary luceneDictionary = new LuceneDictionary(multiReader,
> "content");
>
> Directory spellDirectory = FSDirectory.getDirectory(<single index for
> spellcheck>);
>
> SpellChecker spellChecker = new SpellChecker(spellDirectory);
>
> spellChecker.indexDictionary(luceneDictionary);
>
>
> Is this an acceptable approach or should there be a spellcheck index for
> each separate document index?
>
>
>
> 2) Composite query e.g. Luciene OR doqument
>
> In order to handle the above I did the following:
>
>
> QueryParser queryParser = new AnalyzingQueryParser("content",analyzer);
>
> String input = "luciene OR doqument";
>
> Query query = queryParser.parse(input);
>
> String input2 = query.toString("content");
>
> String[] splitString = input2.split(" ");
>
>
> For each string in the array I called suggestSimilar(..).
>
>
> Is this the most appropriate way of doing this?
>
>
>
> Any help would be appreciated.
>
>
> Cheers
>
> Amin
>
>
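
For the composite query in point 2 of the quoted message, the split-and-suggest step could look roughly like the sketch below. It is plain Java so that it runs without Lucene on the classpath: Suggester is a hypothetical stand-in for the suggestSimilar(word, numSug) call on SpellChecker, and CompositeQuerySuggester is likewise an invented name. The sketch also skips the boolean operators AND, OR and NOT, which is an assumption about what may appear in the string being split; phrases, ranges and field prefixes are not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for SpellChecker.suggestSimilar(word, numSug), so
// that the sketch can be run without Lucene.
interface Suggester {
    String[] suggestSimilar(String word, int numSug);
}

class CompositeQuerySuggester {
    // Tokens treated as query operators rather than words to spell-check.
    private static final Set<String> OPERATORS =
            new HashSet<String>(Arrays.asList("AND", "OR", "NOT"));

    // Split on whitespace and drop boolean operators and empty tokens.
    static List<String> terms(String input) {
        List<String> result = new ArrayList<String>();
        for (String token : input.trim().split("\\s+")) {
            if (token.length() > 0 && !OPERATORS.contains(token)) {
                result.add(token);
            }
        }
        return result;
    }

    // Ask the suggester about each term; insertion order is kept so the
    // suggestions line up with the order of the words in the query.
    static Map<String, String[]> suggest(String input, Suggester suggester, int numSug) {
        Map<String, String[]> suggestions = new LinkedHashMap<String, String[]>();
        for (String term : terms(input)) {
            suggestions.put(term, suggester.suggestSimilar(term, numSug));
        }
        return suggestions;
    }
}
```

With a real SpellChecker, the Suggester would simply delegate to spellChecker.suggestSimilar(term, numSug). Splitting a query string on spaces is fragile for anything beyond simple term queries, so collecting the terms from the parsed Query object may be the more robust route.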
