lucene-java-user mailing list archives

From Amin Mohammed-Coleman <ami...@gmail.com>
Subject Re: SpellChecker in use with composite query
Date Wed, 15 Apr 2009 06:58:24 GMT
Hi

Apologies for bringing this thread up again. I have resolved some of the
issues I originally started with, including composite queries. However,
I have one remaining question that I would be grateful if someone could
assist me with.

I have a class which creates the spell index, but I'm not
sure where to apply it. Should I run it whenever a user
uploads a new file (which kicks off the indexing process)? That does not seem
to be the most appropriate place, as I have one spell index and four
document indexes. I'm wondering what the general approach is. Also,
whenever the indexes change, should I clear the spell index and start again?
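
To make the question concrete, one option I can imagine is to decouple the spell index from individual uploads: each upload only marks the spell index as stale, and a single rebuild runs later (from a timer or at the end of a batch) regardless of which of the four document indexes changed. The sketch below is only an illustration of that idea. SpellIndexRebuilder is a hypothetical helper, not a Lucene class, and the Runnable stands in for the real work (reopen the MultiReader, call spellChecker.clearIndex(), then indexDictionary(...)). As far as I can tell, indexDictionary only adds words and never removes them, which is why a full rebuild would clear first; please treat that as an assumption to verify.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Hypothetical helper (not part of Lucene): records that some document index
 * changed and runs a single spell-index rebuild for any number of changes.
 */
final class SpellIndexRebuilder {
    private final AtomicBoolean dirty = new AtomicBoolean(false);
    private final Runnable rebuild;

    SpellIndexRebuilder(Runnable rebuild) {
        this.rebuild = rebuild;
    }

    /** Call after a commit to any of the document indexes. */
    void markDirty() {
        dirty.set(true);
    }

    /**
     * Call from a timer or at the end of an upload batch.
     * Returns true if a rebuild actually ran.
     */
    boolean rebuildIfDirty() {
        if (!dirty.compareAndSet(true, false)) {
            return false; // nothing changed since the last rebuild
        }
        try {
            rebuild.run();
            return true;
        } catch (RuntimeException e) {
            dirty.set(true); // keep the flag so the next call retries
            throw e;
        }
    }
}

class SpellRebuildDemo {
    public static void main(String[] args) {
        int[] rebuilds = {0};
        // Assumption: in the real application this Runnable would rebuild
        // the spell index; here it only counts how often it is invoked.
        SpellIndexRebuilder rebuilder = new SpellIndexRebuilder(() -> rebuilds[0]++);
        rebuilder.markDirty();      // upload into one document index
        rebuilder.markDirty();      // upload into another document index
        rebuilder.rebuildIfDirty(); // runs the rebuild once
        rebuilder.rebuildIfDirty(); // no-op, nothing changed since
        System.out.println("rebuilds: " + rebuilds[0]); // prints "rebuilds: 1"
    }
}
```

The trade-off is that suggestions lag behind newly uploaded documents until the next rebuild runs.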


Once again apologies for bringing this up.


Cheers
Amin

On Sat, Apr 11, 2009 at 9:59 PM, Amin Mohammed-Coleman <aminmc@gmail.com> wrote:

> Hi
> Another thing I was wondering about is when to build the
> spell index.  Where is the most appropriate place to create it?
>
>
> For example:
>
> IndexReader spellReader = IndexReader.open(fsDirectory1);
>
> IndexReader spellReader2 = IndexReader.open(fsDirectory2);
>
> MultiReader multiReader = new MultiReader(new IndexReader[]
> {spellReader,spellReader2});
>
> LuceneDictionary luceneDictionary = new LuceneDictionary(multiReader,
> "content");
>
> Directory spellDirectory = FSDirectory.getDirectory(<single index for spellcheck>);
>
> SpellChecker spellChecker = new SpellChecker(spellDirectory);
>
> spellChecker.indexDictionary(luceneDictionary);
>
>
> Should this be applied when doing a search or when a document is indexed?
> Should I clear the spell index when the main index changes?
>
>
> I also noticed, when running some tests, that the spell index
> contained numbers from the text extracted from a document.  Is there a way
> to include only alphabetic words in the indexDictionary process?
>
>
>
> Any help would be appreciated.
>
>
> Cheers
>
> On Fri, Apr 10, 2009 at 2:28 PM, Amin Mohammed-Coleman <aminmc@gmail.com> wrote:
>
>> Hi
>> I have been playing around with the SpellChecker class and so far it looks
>> really good.  While developing a test case to show it working I came across a
>> couple of issues, which I have resolved, but I'm not certain that this is the
>> correct approach.  I would therefore be grateful if anyone could tell me
>> whether it is correct or whether I should try something else.
>>
>> 1) Multiple indexes:
>> I have multiple indexes which store different documents based on certain
>> subject matter.  So in order to perform the spellchecking against all indexes
>> I did something like this:
>>
>> IndexReader spellReader = IndexReader.open(fsDirectory1);
>>
>> IndexReader spellReader2 = IndexReader.open(fsDirectory2);
>>
>> MultiReader multiReader = new MultiReader(new IndexReader[]
>> {spellReader,spellReader2});
>>
>> LuceneDictionary luceneDictionary = new LuceneDictionary(multiReader,
>> "content");
>>
>> Directory spellDirectory = FSDirectory.getDirectory(<single index for spellcheck>);
>>
>> SpellChecker spellChecker = new SpellChecker(spellDirectory);
>>
>> spellChecker.indexDictionary(luceneDictionary);
>>
>>
>> Is this an acceptable approach, or should there be a spellcheck index for
>> each separate document index?
>>
>>
>>
>> 2) Composite query, e.g. Luciene OR doqument
>>
>> In order to handle the above I did the following:
>>
>>
>> QueryParser queryParser = new AnalyzingQueryParser("content",analyzer);
>>
>> String input = "luciene OR doqument";
>>
>> Query query = queryParser.parse(input);
>>
>> String input2 = query.toString("content");
>>
>> String[] splitString = input2.split(" ");
>>
>>
>> For each string in the array I called suggestSimilar(..).
>>
>>
>> Is this the most appropriate way of doing this?
>>
>>
>>
>>  Any help would be appreciated.
>>
>>
>> Cheers
>>
>> Amin
>>
>>
>
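
On the quoted question about keeping numbers out of the spell index: a minimal sketch of one way to do it is to filter the words before they reach indexDictionary. The code below is plain Java with no Lucene dependency, and AlphabeticWords is a hypothetical name. My assumption is that it would be wrapped in a small custom Dictionary implementation that delegates to the LuceneDictionary word iterator and returns only the words kept here; I have not checked that wiring against a specific Lucene version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper: keeps only words made up entirely of letters. */
final class AlphabeticWords {
    /** True if the word is non-empty and every code point is a letter. */
    static boolean isAlphabetic(String word) {
        return !word.isEmpty() && word.codePoints().allMatch(Character::isLetter);
    }

    /** Copies the purely alphabetic words from the source iterator, in order. */
    static List<String> filter(Iterator<String> words) {
        List<String> kept = new ArrayList<>();
        while (words.hasNext()) {
            String word = words.next();
            if (word != null && isAlphabetic(word)) {
                kept.add(word);
            }
        }
        return kept;
    }
}

class AlphabeticWordsDemo {
    public static void main(String[] args) {
        // Stand-in for the terms of the "content" field.
        Iterator<String> terms = Arrays.asList("lucene", "2009", "doc42", "index").iterator();
        System.out.println(AlphabeticWords.filter(terms)); // prints "[lucene, index]"
    }
}
```

Character.isLetter accepts letters from any script, so accented and non-Latin words are kept; a stricter a-z check would be needed if only English words should be suggested.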

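On the composite query from the first message: the per-term approach can be sketched without Lucene as below. CompositeQuerySpelling is a hypothetical name, and the suggest function is a stand-in for taking the first result of suggestSimilar(..) for a term. The sketch works on the raw user input rather than on query.toString("content"), so the boolean operators survive in the corrected string. It makes no attempt to handle phrases, field prefixes, parentheses or the + and - modifiers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.UnaryOperator;

/** Hypothetical helper: corrects each plain term of a simple boolean query. */
final class CompositeQuerySpelling {
    private static final Set<String> OPERATORS =
            new HashSet<>(Arrays.asList("AND", "OR", "NOT"));

    /**
     * Splits the input on whitespace, leaves boolean operators untouched and
     * passes every other token through the supplied suggestion function.
     */
    static String correct(String input, UnaryOperator<String> suggest) {
        List<String> out = new ArrayList<>();
        for (String token : input.trim().split("\\s+")) {
            if (token.isEmpty() || OPERATORS.contains(token)) {
                out.add(token);
            } else {
                out.add(suggest.apply(token));
            }
        }
        return String.join(" ", out);
    }
}

class CompositeQuerySpellingDemo {
    public static void main(String[] args) {
        // Assumption: this map plays the role of the spell index lookup.
        Map<String, String> fixes = new HashMap<>();
        fixes.put("luciene", "lucene");
        fixes.put("doqument", "document");
        String corrected = CompositeQuerySpelling.correct(
                "luciene OR doqument", term -> fixes.getOrDefault(term, term));
        System.out.println(corrected); // prints "lucene OR document"
    }
}
```

Terms with no suggestion are returned unchanged by the stand-in function, which seems the safest default for a "did you mean" string.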