lucene-solr-user mailing list archives

From climbingrose <climbingr...@gmail.com>
Subject Re: Spell Check Handler
Date Mon, 09 Jul 2007 10:20:42 GMT
Thanks for the quick reply. However, I'm still not able to set up the
spellchecker. Solr does create the spell directory under data but doesn't seem
to build the spellchecker index. Here are snippets of my schema.xml and
solrconfig.xml:

<field name="title" type="string" indexed="true" stored="true"/>

<requestHandler name="spellchecker" class="solr.SpellCheckerRequestHandler"
startup="lazy">
    <!-- default values for query parameters -->
     <lst name="defaults">
       <int name="suggestionCount">1</int>
       <float name="accuracy">0.5</float>
     </lst>

     <!-- Main init params for handler -->

     <!-- The directory where your SpellChecker Index should live.   -->
     <!-- May be absolute, or relative to the Solr "dataDir" directory. -->
     <!-- If this option is not specified, a RAM directory will be used -->
     <str name="spellcheckerIndexDir">spell</str>

     <!-- the field in your schema that you want to be able to build -->
     <!-- your spell index on. This should be a field that uses a very -->
     <!-- simple FieldType without a lot of Analysis (ie: string) -->
     <str name="termSourceField">title</str>

   </requestHandler>

I tried this URL:
http://localhost:8984/solr/select/?q=Accountent&qt=spellchecker&cmd=rebuild
and received this:

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">2</int>
</lst>
<str name="cmdExecuted">rebuild</str>
<arr name="suggestions"/>
</response>
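
For anyone reproducing this, here is a minimal Python sketch of the same request and a check of the reply. It only builds the URL shown above and parses a response of the shape shown above. The helper names (spellcheck_url, parse_suggestions) are made up for illustration, and it assumes that suggestions, when present, come back as <str> elements inside <arr name="suggestions">; that is a guess from the response layout, not something confirmed in this thread.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Base URL copied from the request above; adjust host/port for your setup.
BASE_URL = "http://localhost:8984/solr/select/"


def spellcheck_url(word, rebuild=False):
    """Build a spellchecker request URL for a single word (hypothetical helper)."""
    params = [("q", word), ("qt", "spellchecker")]
    if rebuild:
        # cmd=rebuild asks the handler to rebuild the spelling index first.
        params.append(("cmd", "rebuild"))
    return BASE_URL + "?" + urlencode(params)


def parse_suggestions(response_xml):
    """Return (cmdExecuted, suggestions) from a response body (hypothetical helper).

    Assumption: each suggestion is a <str> child of <arr name="suggestions">.
    """
    root = ET.fromstring(response_xml)
    cmd = root.find("str[@name='cmdExecuted']")
    arr = root.find("arr[@name='suggestions']")
    suggestions = [s.text for s in arr.findall("str")] if arr is not None else []
    return (cmd.text if cmd is not None else None), suggestions


# The response received above, verbatim.
SAMPLE = """<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">2</int>
</lst>
<str name="cmdExecuted">rebuild</str>
<arr name="suggestions"/>
</response>"""

if __name__ == "__main__":
    print(spellcheck_url("Accountent", rebuild=True))
    cmd, suggestions = parse_suggestions(SAMPLE)
    if not suggestions:
        print("command %r ran, but no suggestions came back" % cmd)
```

With the response above, parse_suggestions returns the executed command ("rebuild") and an empty list, which matches the symptom: the handler accepts the rebuild command but offers no suggestions.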


On 7/9/07, Tristan Vittorio <tristan.vittorio@gmail.com> wrote:
>
> The spellchecker should be available in the 1.2 release. Your query is
> incorrect; try the following:
>
>
> http://localhost:8984/solr/select/?q=java&qt=spellchecker&termSourceField=title_text&cmd=rebuild
>
> The 'q' parameter must contain only the word being checked; you must specify
> the field separately. You can set "termSourceField" in your solrconfig.xml
> file so that you do not need to set it explicitly each time you run a
> spell check query. Also make sure your field isn't heavily processed (e.g.
> with Porter stemmer analyzers); otherwise the suggestions will look a bit
> weird / mangled. Take a look at the wiki page for more info:
>
> http://wiki.apache.org/solr/SpellCheckerRequestHandler
>
> cheers,
> Tristan
>
>
>
> On 7/9/07, climbingrose <climbingrose@gmail.com> wrote:
> >
> > Hi Tristan,
> >
> > Is this spellchecker available in the 1.2 release, or do I have to build
> > the trunk? I tried your instructions but Solr returns nothing:
> >
> > http://localhost:8984/solr/select/?q=title_text:java&qt=spellchecker&cmd=rebuild
> >
> > Result:
> >
> > <response>
> > <lst name="responseHeader">
> > <int name="status">0</int>
> > <int name="QTime">3</int>
> > </lst>
> > <str name="cmdExecuted">rebuild</str>
> > <arr name="suggestions"/>
> > </response>
> >
> > Thanks.
> >
> >
> > On 7/8/07, Tristan Vittorio <tristan.vittorio@gmail.com> wrote:
> > >
> > > Hi Otis,
> > >
> > > I have written a draft wiki entry for the spell checker:
> > > http://wiki.apache.org/solr/SpellCheckerRequestHandler
> > >
> > > I've learned that my initial observation about the suggestion ordering
> > > was incorrect: it does in fact order the results by popularity (term
> > > frequency) of the word in the termSourceField. The problem I experienced
> > > was caused by setting termSourceField to a field of type "text", which
> > > heavily stemmed and analyzed the words. I found that using the
> > > StandardTokenizer and StandardFilter and removing the PorterStemmer and
> > > LowerCaseFilter from the field schema really improved the spell
> > > checker's suggestions.
> > >
> > > I haven't included this info on the wiki page yet; I'll try to update it
> > > soon when I have a bit more time.
> > >
> > > cheers,
> > > Tristan
> > >
> > >
> > >
> > > On 7/8/07, Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:
> > > >
> > > > Tristan - good summary - want to copy that to the Solr Wiki?
> > > >
> > > > Thanks,
> > > > Otis
> > > >
> > > > . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
> > > > Simpy -- http://www.simpy.com/  -  Tag  -  Search  -  Share
> > > >
> > > > ----- Original Message ----
> > > > From: Tristan Vittorio <tristan.vittorio@gmail.com>
> > > > To: solr-user@lucene.apache.org
> > > > Sent: Saturday, July 7, 2007 1:51:15 AM
> > > > Subject: Re: Spell Check Handler
> > > >
> > > > I couldn't find any documentation on the spell check handler either, but
> > > > found enough information in the solrconfig.xml file; simply search for
> > > > "SpellCheckerRequestHandler" (online version here):
> > > >
> > > > http://svn.apache.org/repos/asf/lucene/solr/trunk/example/solr/conf/solrconfig.xml
> > > >
> > > > You can view the original development discussion in JIRA (not sure how
> > > > helpful that will be for you though):
> > > > https://issues.apache.org/jira/browse/SOLR-81
> > > >
> > > > In a nutshell, the configuration parameters available are:
> > > >
> > > > suggestionCount: determines how many spelling suggestions are returned.
> > > > accuracy: a float value between 0.0 and 1.0 that sets how closely the
> > > > suggested words must match the original word being checked.
> > > > spellcheckerIndexDir and termSourceField: check solrconfig.xml for a
> > > > full explanation.
> > > >
> > > > In order to use the spell checking handler for the first time, you need
> > > > to explicitly build the spelling index with a sample query something
> > > > like this:
> > > >
> > > > http://localhost:8080/solr/select/?q=macrosoft&qt=spellchecker&cmd=rebuild
> > > >
> > > > Depending on how large your main index is, this rebuild operation could
> > > > take a while. Subsequent queries can omit '&cmd=rebuild' and will return
> > > > results much faster:
> > > >
> > > > http://localhost:8080/solr/select/?q=macrosoft&qt=spellchecker
> > > >
> > > > The order of the suggestions returned seems to be based on the accuracy
> > > > figure (i.e. how closely each matches the original word). It would be
> > > > great to be able to sort these suggested results based on the term
> > > > frequency / document frequency of the suggested word in the main index,
> > > > since the most accurate suggestion may not always be the most relevant.
> > > >
> > > > As far as I can tell there is currently no way of doing this using the
> > > > spellchecker handler alone (you could always run separate standard
> > > > queries on each word suggestion and order by numDocs, but that would be
> > > > very inefficient). Has anybody else tried to achieve this?
> > > >
> > > > cheers,
> > > > Tristan
> > > >
> > > >
> > > >
> > > > On 7/7/07, Andrew Nagy <andrew.nagy@villanova.edu> wrote:
> > > > >
> > > > > Hello, is there any documentation on how to use the new spell check
> > > > > module?
> > > > >
> > > > > Thanks
> > > > > Andrew
> > > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >
> >
> >
> > --
> > Regards,
> >
> > Cuong Hoang
> >
>



-- 
Regards,

Cuong Hoang
