incubator-blur-dev mailing list archives

From Dibyendu Bhattacharya <dibyendu.bhattach...@gmail.com>
Subject Custom EdgeNGram Analyzer For Blur Text Field
Date Tue, 13 May 2014 15:54:08 GMT
Hi,

I was trying to configure a custom analyzer (EdgeNGram) for a text field.

Below is the very simple Edge N-Gram analyzer code, which works fine on its
own.

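import java.io.IOException;
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.core.StopAnalyzer;
import org.apache.lucene.analysis.core.StopFilter;
import org.apache.lucene.analysis.ngram.EdgeNGramTokenFilter;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.util.Version;

public class EdgeNGramAnalyzer extends Analyzer {
  @Override
  protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
    // Standard tokenization, then lowercasing and English stop-word removal.
    final StandardTokenizer src = new StandardTokenizer(Version.LUCENE_43, reader);
    TokenStream tok = new StandardFilter(Version.LUCENE_43, src);
    tok = new LowerCaseFilter(Version.LUCENE_43, tok);
    tok = new StopFilter(Version.LUCENE_43, tok, StopAnalyzer.ENGLISH_STOP_WORDS_SET);
    // Front-anchored n-grams of length 3 to 20 for each remaining token.
    tok = new EdgeNGramTokenFilter(tok, EdgeNGramTokenFilter.Side.FRONT, 3, 20);
    return new TokenStreamComponents(src, tok) {
      @Override
      protected void setReader(final Reader reader) throws IOException {
        super.setReader(reader);
      }
    };
  }
}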
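
For reference, the kind of terms this filter configuration (front side, minimum length 3, maximum length 20) is expected to produce for a single token can be sketched in plain Java without Lucene. This is only an illustration of the idea: the class `EdgeNGramSketch` and the helper `frontEdgeNGrams` are hypothetical names, not Lucene or Blur APIs, and the assumption that tokens shorter than the minimum length yield no terms is mine.

```java
import java.util.ArrayList;
import java.util.List;

public class EdgeNGramSketch {

    // Hypothetical helper (not a Lucene API): returns the prefixes of `token`
    // whose length lies in [minGram, maxGram]. A token shorter than minGram
    // yields an empty list.
    static List<String> frontEdgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<String>();
        int longest = Math.min(maxGram, token.length());
        for (int len = minGram; len <= longest; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    public static void main(String[] args) {
        // prints [ana, anal, analy, analyz, analyze, analyzer]
        System.out.println(frontEdgeNGrams("analyzer", 3, 20));
        // prints []
        System.out.println(frontEdgeNGrams("hi", 3, 20));
    }
}
```

If the custom analyzer were actually applied at index time, prefix terms like these, rather than only whole words, would be expected in the index for the column.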


I configured this analyzer for a ColumnDefinition using the following steps
via the Thrift client:

        ColumnDefinition customAnalyzerDefn = new ColumnDefinition();
        customAnalyzerDefn.setFamily(FAMILY_NAME);
        customAnalyzerDefn.setColumnName(COLUMN_NAME);
        customAnalyzerDefn.setFieldType("text");

        Map<String,String> analyzer = new HashMap<String,String>();
        analyzer.put("analyzerClass", "x.y.z.EdgeNGramAnalyzer");
        customAnalyzerDefn.setProperties(analyzer);

        client.addColumnDefinition(TABLE_NAME, customAnalyzerDefn);


I copied the jar containing the analyzer class into the Blur lib folder.
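
One sanity check that could rule out a classpath problem is to confirm that the class named under the "analyzerClass" key can be loaded by name. The sketch below is a hypothetical standalone check, not Blur's own loading code: `AnalyzerClassCheck` and `isLoadable` are made-up names, and the result only says something about the classpath of the JVM it runs in, so it would need to run with the same classpath as the Blur server to be meaningful.

```java
import java.util.HashMap;
import java.util.Map;

public class AnalyzerClassCheck {

    // Hypothetical helper: looks up a class name under `key` in the properties
    // map and reports whether that class can be loaded (without initializing
    // it) from this JVM's classpath.
    static boolean isLoadable(Map<String, String> properties, String key) {
        String className = properties.get(key);
        if (className == null) {
            return false; // property not set at all
        }
        try {
            Class.forName(className, false, AnalyzerClassCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false; // jar missing from the classpath
        } catch (LinkageError e) {
            return false; // class found, but a dependency could not be resolved
        }
    }

    public static void main(String[] args) {
        Map<String, String> analyzer = new HashMap<String, String>();
        analyzer.put("analyzerClass", "x.y.z.EdgeNGramAnalyzer");
        System.out.println("analyzerClass loadable: " + isLoadable(analyzer, "analyzerClass"));
    }
}
```

A `false` result here would point to the jar not being visible, while `true` would suggest the problem lies in how the property is passed or read.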

However, I do not see this analyzer being called; Blur always uses the
default StandardAnalyzer for the text field. Kindly let me know if I am
missing something, or whether there is an issue with the "analyzerClass"
property not getting set. I found that Blur uses this key to set the
Analyzer in TextFieldTypeDefinition.

Regards,
Dibyendu
