lucene-solr-user mailing list archives

From anuvenk <anuvenkat...@hotmail.com>
Subject Re: spellcheckhandler
Date Sun, 27 Jan 2008 01:51:07 GMT

Thanks a lot for clearing up my doubts. Would you know if the Solr wiki is up
to date with the documentation for the new features that are being added? I
rely entirely on the Solr wiki documentation for my project. If you can,
please send me the files you mentioned and I'll be happy to test them. I
appreciate your help!

scott.tabar wrote:
> 
> Anuvenk,
> 
> Sorry for this "Third" email, but I was reading your question below and I
> think it warrants yet another reply.
> 
> Just some background from my focus and involvement, and hence the
> generation of the JavaDocs.  I was primarily interested in having a Solr
> based spell checker that behaved more like a traditional spell checker. 
> In my application, when I generated the input in to Solr for inclusion of
> the spell checker indexer, I was only interested in single words and not
> multi-word sets.  My intention was to send multiple words to the handler
> and have it return details on each word as it stands independently when
> the parameter multiWords was set, otherwise it was to use all input words
> as a single check against the handler.  As such, in my original efforts, I
> had no multiple words in a single term, as you were asking below.  That is
> not to say it is not possible, but I just wanted to let you know the
> original focus of my work.
> 
> I did look a little closer at the JavaDocs and it looks like they have
> been updated from what I originally generated.  So perhaps they may be up
> to date?
> 
> One thing I would like to point out is that I put some effort into
> creating a test case for the SpellCheckerRequestHandler.  If it still
> exists (I have not checked the head for a long time), it would be a
> good starting point for some simple testing with limited data sets of
> your own.  Just make a copy of it, then feed in multi-word terms and
> see how it responds to the different settings.  This will also allow you
> to play around with the configuration settings in the schema and
> solrconfig files without impacting your actual Solr instance, and the
> turnaround time for each new test could be seconds rather than minutes.
> 
> The locations in svn and file names of the unit tests that I created were:
>   /test/test-files/solr/conf/schema-spellchecker.xml
>   /test/test-files/solr/conf/solrconfig-spellchecker.xml
>   /test/org/apache/solr/handler/SpellCheckerRequestHandlerTest.java
> 
> If these do not currently exist in svn, let me know and I can pass
> along the contents so you can recreate them locally to test with.
> 
>   Best of luck,
>     Scott Tabar
> 
> ---- anuvenk <anuvenkatesh@hotmail.com> wrote: 
> 
> Thanks. But I'm looking at this example:
> http://.../spellchecker?indent=on&onlyMorePopular=true&accuracy=.6&suggestionCount=20&q=facial+salophosphoprotein
> on
> http://lucene.apache.org/solr/api/org/apache/solr/handler/SpellCheckerRequestHandler.html
> It seems to return results (well, in the example) both with and without
> extendedResults=true. Does that mean that 'facial salophosphoprotein' was
> a single term in the index?
> 
> 
> hossman wrote:
>> 
>> : 
>> : I did try with the latest nightly build and followed the steps outlined in
>> : http://wiki.apache.org/solr/SpellCheckerRequestHandler
>> : with regards to creating new catchall field 'spell' of type 'spell' and
>> : copied my text fields to 'spell' at index time.
>> : Still q=grapics returns 'graphics'
>> : but q=grapics card returns nothing.
>> : But the same queries return the correct spelling with string fieldtypes.
>> : Any fix available? 
>> 
>> I don't think Otis was suggesting any specific fix was available in the
>> nightly builds; I believe he was just pointing out that if someone had
>> committed a fix for a bug, you wouldn't need to wait for 1.3 --
>> you could test it now using the nightly builds.
>> 
>> That said: I don't see any currently open or recently resolved bugs
>> related to spellchecking and multiple words ... I believe (but I'm not
>> 100% positive) that "multi word" spell correction will work, as long as
>> your dictionary contains those "multiple words" as individual "terms".
>> 
>> i.e.: if you want "graphics card" to be a suggestion for "grapics card",
>> then you need to use a termSourceField in which "graphics card" is a
>> single term (either because it is untokenized, or maybe because you use a
>> word-based ngram tokenfilter, etc...)
>> 
>> Alternatively, if you want to get "graphics asdfghjk" as a suggestion for
>> "grapics asdfghjk" (even though "asdfghjk" isn't in your index at all),
>> hitting the spell correction handler for each input word individually is
>> probably your best bet.
>> 
>> 
>> : > You don't need to wait for 1.3 to be released - you can simply use a
>> : > recent nightly build.
>> 
>> 
>> -Hoss
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://www.nabble.com/spellcheckhandler-tp14627712p15100704.html
> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/spellcheckhandler-tp14627712p15115105.html
Sent from the Solr - User mailing list archive at Nabble.com.
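
The per-word approach Hoss describes in the quoted message (hitting the
handler once for each input word) could be sketched roughly as follows. This
is only an illustration, not code from Solr: the function name
per_word_spellcheck_urls and the base URL are hypothetical placeholders, and
the only request parameters used are ones that appear in the example URL
quoted in this thread (q, suggestionCount).

```python
from urllib.parse import urlencode


def per_word_spellcheck_urls(base_url, query, **params):
    """Build one spellcheck request URL per whitespace-separated word.

    base_url is assumed to point at the spellchecker handler; the real
    host and path depend on your own Solr setup.
    """
    urls = []
    for word in query.split():
        # Each word gets its own q= value so it is checked independently
        # of the other words in the original query.
        urls.append(base_url + "?" + urlencode({"q": word, **params}))
    return urls


if __name__ == "__main__":
    # Placeholder host: substitute the address of your own Solr instance.
    for url in per_word_spellcheck_urls(
        "http://example.invalid/solr/spellchecker",
        "grapics card",
        suggestionCount=5,
    ):
        print(url)
```

The caller would then fetch each URL and combine the per-word suggestions
itself; how best to combine them is left open here.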

