Mailing-List: contact java-user-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: java-user@lucene.apache.org
MIME-Version: 1.0
In-Reply-To: <6f4104d80904111359p5621561bw2939160db9ba862b@mail.gmail.com>
References: <6f4104d80904100628q79150b4oa3e5436cfb66c581@mail.gmail.com>
	 <6f4104d80904111359p5621561bw2939160db9ba862b@mail.gmail.com>
Date: Wed, 15 Apr 2009 07:58:24 +0100
Message-ID: <6f4104d80904142358v7854b805l985abe131bf765c2@mail.gmail.com>
Subject: Re: SpellChecker in use with composite query
From: Amin Mohammed-Coleman <aminmc@gmail.com>
To: "java-user@lucene.apache.org" <java-user@lucene.apache.org>
Content-Type: multipart/alternative; boundary=0016e6dee7444c9dd90467927912

--0016e6dee7444c9dd90467927912
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Hi

Apologies for bringing this thread up again. I have resolved some of the
issues I originally started with, including composite queries. However, I
have one remaining question which I would be grateful if someone could
assist me with.

I have a class which creates the spell index, but I'm not sure where to
invoke it. Should I run it whenever a user uploads a new file (which kicks
off the indexing process)? That may not be the most appropriate place, as I
have one spell index and 4 document indexes. I'm wondering what the general
approach is. Also, whenever the indexes change, should I clear the spell
index and start again?
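
One possible arrangement, sketched under assumptions rather than taken from
the Lucene API: because a single spell index covers all four document
indexes, each indexing run only marks the spell index as stale, and one
scheduled job rebuilds it when needed. SpellIndexRefresher, markStale and
refreshIfStale are hypothetical names, and the Runnable stands in for the
class that creates the spell index (which could call
SpellChecker.clearIndex() before indexDictionary if a full rebuild is
wanted).

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical helper: rebuilds the shared spell index only after a document index changed. */
class SpellIndexRefresher {
    private final AtomicBoolean stale = new AtomicBoolean(false);
    private final Runnable rebuild; // stands in for the spell index creation class

    SpellIndexRefresher(Runnable rebuild) {
        this.rebuild = rebuild;
    }

    /** Called at the end of every indexing run, whichever document index it touched. */
    void markStale() {
        stale.set(true);
    }

    /** Called from one place (e.g. a timer); returns true if a rebuild actually ran. */
    boolean refreshIfStale() {
        if (!stale.compareAndSet(true, false)) {
            return false; // nothing changed since the last rebuild
        }
        rebuild.run();
        return true;
    }

    public static void main(String[] args) {
        int[] rebuilds = {0};
        SpellIndexRefresher refresher = new SpellIndexRefresher(() -> rebuilds[0]++);
        refresher.markStale(); // upload into one document index
        refresher.markStale(); // upload into another document index
        refresher.refreshIfStale();
        refresher.refreshIfStale(); // no further changes, so no second rebuild
        System.out.println("rebuilds: " + rebuilds[0]);
    }
}
```

With this shape, several uploads between two refresh calls cost only one
rebuild, at the price of the spell index lagging slightly behind the
document indexes.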


Once again apologies for bringing this up.


Cheers
Amin

On Sat, Apr 11, 2009 at 9:59 PM, Amin Mohammed-Coleman <aminmc@gmail.com> wrote:

> Hi
> Another thing that I was wondering is how to apply the construction of the
> spell index.  Where is the most appropriate place to create the spell index?
>
>
> For example:
>
> IndexReader spellReader = IndexReader.open(fsDirectory1);
>
> IndexReader spellReader2 = IndexReader.open(fsDirectory2);
>
> MultiReader multiReader = new MultiReader(new IndexReader[]
> {spellReader,spellReader2});
>
> LuceneDictionary luceneDictionary = new LuceneDictionary(multiReader,
> "content");
>
> Directory spellDirectory = FSDirectory.getDirectory(<single index for
> spellcheck>);
>
> SpellChecker spellChecker = new SpellChecker(spellDirectory);
>
> spellChecker.indexDictionary(luceneDictionary);
>
>
> Should this be applied when doing a search or when a document is indexed?
> Should I clear the spell index when the main index changes?
>
>
> I also noticed, when running some tests, that the spell index contained
> numbers from the text extracted from a document.  Is there a way to include
> only alphabetic characters in the indexDictionary process?
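
A minimal sketch of one way to keep numbers out, assuming the words are
filtered before they reach indexDictionary. AlphabeticWordFilter is a
hypothetical name and uses only the JDK. In practice it would probably sit
inside a custom Dictionary implementation wrapping the iterator returned by
LuceneDictionary.getWordsIterator(); that wiring is an untested assumption.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical filter: passes through only words made up entirely of letters. */
class AlphabeticWordFilter implements Iterator<String> {
    private final Iterator<String> source;
    private String next; // next accepted word, or null if none has been looked up yet

    AlphabeticWordFilter(Iterator<String> source) {
        this.source = source;
    }

    /** True only if the word is non-empty and every character is a letter. */
    static boolean isAlphabetic(String word) {
        if (word == null || word.isEmpty()) {
            return false;
        }
        for (int i = 0; i < word.length(); i++) {
            if (!Character.isLetter(word.charAt(i))) {
                return false; // digit, punctuation or whitespace
            }
        }
        return true;
    }

    @Override
    public boolean hasNext() {
        // Skip rejected words until an accepted one is found or the source runs out.
        while (next == null && source.hasNext()) {
            String candidate = source.next();
            if (isAlphabetic(candidate)) {
                next = candidate;
            }
        }
        return next != null;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String word = next;
        next = null;
        return word;
    }

    public static void main(String[] args) {
        Iterator<String> words = new AlphabeticWordFilter(
                Arrays.asList("lucene", "2009", "index4", "document").iterator());
        while (words.hasNext()) {
            System.out.println(words.next());
        }
    }
}
```

Filtering at this point leaves the document indexes untouched; only the
words fed to the spell index change.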
>
>
>
> Any help would be appreciated.
>
>
> Cheers
>
> On Fri, Apr 10, 2009 at 2:28 PM, Amin Mohammed-Coleman <aminmc@gmail.com> wrote:
>
>> Hi
>> I have been playing around with the SpellChecker class and so far it looks
>> really good.  While developing a testcase to show it working I came across a
>> couple of issues which I have resolved but I'm not certain if this is the
>> correct approach.  I would therefore be grateful if anyone could tell me
>> whether it is correct or I should try something else.
>>
>> 1) Multiple Indexes:
>> I have multiple indexes which store different documents based on certain
>> subject matter.  So in order to perform the spellchecking against all
>> indexes I did something like this:
>>
>> IndexReader spellReader = IndexReader.open(fsDirectory1);
>>
>> IndexReader spellReader2 = IndexReader.open(fsDirectory2);
>>
>> MultiReader multiReader = new MultiReader(new IndexReader[]
>> {spellReader,spellReader2});
>>
>> LuceneDictionary luceneDictionary = new LuceneDictionary(multiReader,
>> "content");
>>
>> Directory spellDirectory = FSDirectory.getDirectory(<single index for
>> spellcheck>);
>>
>> SpellChecker spellChecker = new SpellChecker(spellDirectory);
>>
>> spellChecker.indexDictionary(luceneDictionary);
>>
>>
>> Is this an acceptable approach or should there be a spellcheck index for
>> each separate document index?
>>
>>
>>
>>  2) Composite query e.g. Luciene OR doqument
>>
>> In order to handle the above I did the following:
>>
>>
>> QueryParser queryParser = new AnalyzingQueryParser("content",analyzer);
>>
>> String input = "luciene OR doqument";
>>
>> Query query = queryParser.parse(input);
>>
>> String input2 = query.toString("content");
>>
>> String[] splitString = input2.split(" ");
>>
>>
>> For each string in the array I called suggestSimilar(..).
>>
>>
>> Is this the most appropriate way of doing this?
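
The per-term loop can be sketched as follows, using only the JDK so that it
stands alone. QueryCorrector and its Function<String, String[]> parameter
are hypothetical; the function stands in for a call such as
spellChecker.suggestSimilar(term, 1). The sketch works on the raw input
rather than on query.toString(), assumes whitespace-separated terms, and
leaves the operators AND, OR and NOT untouched. It does not handle phrases,
field prefixes or parentheses.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical helper: rewrites each term of a simple boolean query with its top suggestion. */
class QueryCorrector {
    private static final Set<String> OPERATORS =
            new HashSet<>(Arrays.asList("AND", "OR", "NOT"));

    static String correct(String input, Function<String, String[]> suggester) {
        StringBuilder corrected = new StringBuilder();
        for (String token : input.trim().split("\\s+")) {
            String replacement = token;
            // Only look up real terms; boolean operators pass through unchanged.
            if (!token.isEmpty() && !OPERATORS.contains(token)) {
                String[] suggestions = suggester.apply(token);
                if (suggestions != null && suggestions.length > 0) {
                    replacement = suggestions[0];
                }
            }
            if (corrected.length() > 0) {
                corrected.append(' ');
            }
            corrected.append(replacement);
        }
        return corrected.toString();
    }

    public static void main(String[] args) {
        // Stub suggester with hard-coded answers, purely for illustration.
        Function<String, String[]> stub = term -> {
            if (term.equals("luciene")) return new String[] {"lucene"};
            if (term.equals("doqument")) return new String[] {"document"};
            return new String[0];
        };
        System.out.println(correct("luciene OR doqument", stub));
    }
}
```

Terms with no suggestion are kept as typed, so a correctly spelled word that
is missing from the spell index is not replaced by something else.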
>>
>>
>>
>>  Any help would be appreciated.
>>
>>
>> Cheers
>>
>> Amin
>>
>>
>

--0016e6dee7444c9dd90467927912--