lucene-dev mailing list archives

From Pradeep Pujari <Prade...@rocketmail.com>
Subject ShingleFilterFactory class gives bi-words
Date Thu, 28 Jul 2011 17:34:03 GMT
Hi,

I am trying to create shingles with minShingleSize = 10, but it also returns
bi-grams. Here is my schema definition:
 			<filter class="solr.ShingleFilterFactory" minShingleSize="10" maxShingleSize="25"
 				outputUnigrams="false" outputUnigramsIfNoShingles="false" tokenSeparator=" "/>


For the input string "Apple - iPad 3G Wi-Fi - 32GB", it produces:
	Apple -
	- iPad
	iPad 3G
	3G &43;
	&43; Wi-Fi
	Wi-Fi -
	- 32GB
	Apple - iPad 3G
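
As a point of comparison, here is a minimal Python sketch of how shingles between a minimum and maximum size are usually defined. It is not Solr's implementation: the function name `shingles` and the plain whitespace split are assumptions made for illustration, and the real tokenizer in the schema may split the input differently.

```python
def shingles(tokens, min_size, max_size, sep=" "):
    """Return word n-grams (shingles) containing min_size..max_size tokens."""
    out = []
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            end = start + size
            if end > len(tokens):
                break  # larger sizes cannot fit from this start position
            out.append(sep.join(tokens[start:end]))
    return out


# Assumption: simple whitespace tokenization of the input from the message.
tokens = "Apple - iPad 3G Wi-Fi - 32GB".split()

print(len(tokens))               # number of whitespace-separated tokens
print(shingles(tokens, 10, 25))  # shingles of 10 to 25 tokens
print(shingles(tokens, 2, 2))    # bi-grams only
```

Under these assumptions the input has only 7 tokens, so a minimum shingle size of 10 would yield no shingles at all, while a size of 2 yields bi-grams similar to the ones listed above.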

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

