lucene-general mailing list archives

From Osullivan L.
Subject RE: charFilter
Date Thu, 13 Sep 2012 10:43:14 GMT
Hi Folks,

I'm getting the following error after using a custom filter:

SEVERE: org.apache.solr.common.SolrException:
Token PR  2823.000000 A0.200000 S0.819880 exceeds length of provided text sized 15

As the error suggests, the input value is PR2823.A2S81988 (15 chars), whereas the token
produced by my filter is longer than that. I have been informed that the correctOffset()
method of the CharFilter class can be used to resolve this issue, but as far as I can tell,
it only returns an offset value; it doesn't set one.

I have included some details below.

Kind Regards,


In my schema I have:

    <fieldType name="LCNormalized" class="solr.TextField" sortMissingLast="true" omitNorms="true">
        <analyzer>
          <charFilter class="com.test.solr.analysis.LukesTestCharFilterFactory"/>
          <tokenizer class="solr.KeywordTokenizerFactory"/>
        </analyzer>
    </fieldType>

and the factory and filter classes are:

public class LukesTestCharFilterFactory extends BaseCharFilterFactory {

	public CharStream create(CharStream input) {
		return new LukesTestCharFilter(input);
	}
}

public final class LukesTestCharFilter extends BaseCharFilter {
  public LukesTestCharFilter(CharStream input)  {
	  // The superclass constructor takes the wrapped stream
	  super(input);
	  try {
          // Load the whole input into a string
          StringBuilder sb = new StringBuilder();
          char[] buf = new char[1024];

          int len;
          while ((len = input.read(buf)) >= 0) {
              sb.append(buf, 0, len);
          }

          String original = sb.toString();
          // getLCShelfkey(...) is defined elsewhere and is not shown in this message
          String modified = getLCShelfkey(original);
          CharStream result = CharReader.get(new StringReader(modified));

          // Replace the wrapped stream with the rewritten text
          this.input = result;
      } catch (IOException e) {
          System.err.println("There was a problem parsing input.  Skipping.");
      }
  }
}
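
To make the length mismatch behind the error concrete, here is a minimal sketch that runs without Lucene or Solr. It only illustrates the arithmetic: a keyword-style tokenizer emits one token spanning the whole filtered text, so its end offset is the filtered length, which is larger than the 15-character original unless something maps it back. The class OffsetClampDemo and its correct(...) helper are hypothetical names made up for this illustration; they are not part of the Lucene or Solr API, and clamping is just one possible mapping, not necessarily what the library does.

```java
public class OffsetClampDemo {

    // Hypothetical helper (not a Lucene method): map an offset measured in
    // the filtered text back into the original text by clamping it to the
    // range [0, originalLength].
    public static int correct(int filteredOffset, int originalLength) {
        return Math.max(0, Math.min(filteredOffset, originalLength));
    }

    public static void main(String[] args) {
        String original = "PR2823.A2S81988";
        String filtered = "PR  2823.000000 A0.200000 S0.819880";

        // A keyword-style tokenizer emits a single token covering all of
        // the filtered text, so the raw offsets are 0 and filtered.length().
        int startOffset = 0;
        int endOffset = filtered.length();

        System.out.println("original length:  " + original.length());
        System.out.println("uncorrected span: " + startOffset + ".." + endOffset);
        System.out.println("corrected span:   "
                + correct(startOffset, original.length()) + ".."
                + correct(endOffset, original.length()));
    }
}
```

With a mapping like this, offsets reported against the filtered text can no longer point past the end of the original value, which is the condition the error message complains about.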
