lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: File-based Spelling
Date Tue, 13 Oct 2015 01:28:51 GMT
Let's see your solrconfig entries. Doubtless something innocent-seeming
isn't quite right.

This might provide some clues:
http://lucidworks.com/blog/2015/03/04/solr-suggester/

The Solr Reference Guide is the first place to look. A lot of this
functionality has changed in recent years, so I always try to start there:
https://cwiki.apache.org/confluence/display/solr/Spell+Checking
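
For comparison, a file-based spellchecker entry in solrconfig.xml usually
has roughly the shape below. This is only a sketch modeled on the reference
guide's example, not your actual config: the paths are the ones from your
message, and the component name "spellcheck" and spellchecker name "file"
are placeholders.

```xml
<!-- Sketch only: names "spellcheck" and "file" are placeholders. -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">file</str>
    <str name="classname">solr.FileBasedSpellChecker</str>
    <!-- Plain-text dictionary, one word per line -->
    <str name="sourceLocation">/usr/share/dict/linux.words</str>
    <str name="characterEncoding">UTF-8</str>
    <!-- Directory where the spelling index is built -->
    <str name="spellcheckIndexDir">./data/spFile</str>
    <str name="buildOnStartup">true</str>
  </lst>
</searchComponent>
```

Which dictionary a request uses is selected with the spellcheck.dictionary
parameter, so it is worth checking that it matches the name given above.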

Best,
Erick

On Mon, Oct 12, 2015 at 12:37 PM, Mark Fenbers <mark.fenbers@noaa.gov> wrote:
> Greetings!
>
> I'm attempting to use a file-based spell checker.  My sourceLocation is
> /usr/share/dict/linux.words, and my spellcheckIndexDir is set to
> ./data/spFile.  BuildOnStartup is set to true, and I see nothing to suggest
> any sort of problem/error in solr.log.  However, in my ./data/spFile/
> directory, there are only two files: segments_2 with only 71 bytes in it,
> and a zero-byte write.lock file.  For a source dictionary having 480,000
> words in it, I was expecting a bit more substance in the ./data/spFile
> directory.  Something doesn't seem right with this.
>
> Moreover, I ran a query on the word Fenbers, which isn't listed in the
> linux.words file, but there are several similar words.  The results I got
> back were odd, and suggestions included the following:
> fenber
> f en be r
> f e nb er
> f en b er
> f e n be r
> f en b e r
> f e nb e r
> f e n b er
> f e n b e r
>
> But I expected suggestions like fenders, embers, and fenberry, etc. I also
> ran a query on Mark (which IS listed in linux.words) and got back two
> suggestions in a similar format.  I played with configurables like changing
> the fieldType from text_en to string and the characterEncoding from UTF-8 to
> ASCII, etc., but nothing seemed to yield any different results.
>
> Can anyone offer suggestions as to what I'm doing wrong?  I've been
> struggling with this for more than 40 hours now!  I'm surprised my
> persistence has lasted this long!
>
> Thanks,
> Mark
