lucene-solr-user mailing list archives

From "Lochschmied, Alexander" <>
Subject AW: Spellchecking and suggesting part numbers
Date Mon, 03 Nov 2014 08:36:42 GMT
Thanks James, this did help a lot.

Is it possible to make DirectSolrSpellChecker prefer suggestions that share the longest
run of matching leading characters with the search term?


-----Original Message-----
From: Dyer, James [] 
Sent: Wednesday, September 24, 2014 16:42
Subject: RE: Spellchecking and suggesting part numbers


You could use a higher value for spellcheck.count, maybe 20 or so, then in your application
pick out the suggestions that make changes on the right side.
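
A rough sketch of that first option, with the count raised in the handler defaults. The
value 20 is the one mentioned above; the handler name and field come from the original
setup quoted further down, and the "last-components" entry is an assumption, since the
quoted setup does not show it. The same parameter can also be passed per request as
spellcheck.count=20.

```xml
<requestHandler name="/spell_part" class="solr.SearchHandler" startup="lazy">
	<lst name="defaults">
		<str name="df">did_you_mean_part</str>
		<str name="spellcheck">on</str>
		<!-- Ask for more candidates so the application can filter them. -->
		<str name="spellcheck.count">20</str>
	</lst>
	<arr name="last-components">
		<!-- Assumed: the search component name from the original setup. -->
		<str>spellcheck_part</str>
	</arr>
</requestHandler>
```

The application would then keep only the suggestions whose leading characters match the
search term and discard the rest.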

Another option is to use DirectSolrSpellChecker (usually a better choice anyhow) and set the
"minPrefix" parameter.  This requires the first n characters on the left side to match before
it will make suggestions.  Taking a quick look at the code, it seems to me it won't try to
correct anything in this prefix region either.  So perhaps you can set this to 2-4 (default=1).
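
A minimal sketch of what that could look like, assuming the component name and field from
the original setup quoted below. The value 3 is only an illustrative pick from the 2-4
range, and the spellchecker name "default" is an assumption. DirectSolrSpellChecker reads
terms from the main index, so no spellcheckIndexDir is configured here.

```xml
<searchComponent name="spellcheck_part" class="solr.SpellCheckComponent">
	<lst name="spellchecker">
		<!-- Assumed name; adjust to whatever the handler references. -->
		<str name="name">default</str>
		<str name="classname">solr.DirectSolrSpellChecker</str>
		<str name="field">did_you_mean_part</str>
		<!-- Leading characters that must match the query term (default is 1). -->
		<int name="minPrefix">3</int>
	</lst>
</searchComponent>
```

With this setting, a suggestion such as ABCE1234 for the term ABCD1234 would still be
possible, because the first three characters match; raising minPrefix to 4 would be
expected to rule it out.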

James Dyer
Ingram Content Group
(615) 213-4311

-----Original Message-----
From: Lochschmied, Alexander [] 
Sent: Wednesday, September 24, 2014 9:06 AM
Subject: Spellchecking and suggesting part numbers

Hello Solr Users,

we are trying to get suggestions for part numbers using the spellchecker.

Problem scenario:

ABCD1234 // This is the search term
ABCE1234 // This is what we get from spellchecker
ABCD1244 // This is what we would like to get from spellchecker

Characters towards the left of our part numbers are more relevant.

The setup is:

	<searchComponent name="spellcheck_part" class="solr.SpellCheckComponent">
		<lst name="spellchecker">
			<str name="classname">solr.IndexBasedSpellChecker</str>
			<str name="spellcheckIndexDir">./spellchecker</str>
			<str name="field">did_you_mean_part</str>
		</lst>
	</searchComponent>

	<requestHandler name="/spell_part" class="solr.SearchHandler" startup="lazy">
		<lst name="defaults">
			<str name="df">did_you_mean_part</str>
			<str name="spellcheck">on</str>
		</lst>
		<arr name="last-components">
		</arr>
	</requestHandler>

	<fieldType name="did_you_mean_part" class="solr.TextField" positionIncrementGap="100">
		<analyzer type="index">
			<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[\s]+" replacement=""/>
			<tokenizer class="solr.WhitespaceTokenizerFactory"/>
			<filter class="solr.LowerCaseFilterFactory"/>
			<filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20" side="front"/>
			<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
		</analyzer>
		<analyzer type="query">
			<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[\s]+" replacement=""/>
			<tokenizer class="solr.KeywordTokenizerFactory"/>
			<filter class="solr.LowerCaseFilterFactory"/>
			<filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20" side="front"/>
		</analyzer>
	</fieldType>

Can we tweak the setup so that we get more relevant part numbers?

