incubator-blur-dev mailing list archives

From Dibyendu Bhattacharya <dibyendu.bhattach...@gmail.com>
Subject Re: Custom EdgeNGram Analyzer For Blur Text Field
Date Thu, 15 May 2014 04:33:24 GMT
Hi,

Thanks to Aaron and Garrett for your emails. I was able to configure the custom
EdgeNGramAnalyzer for a text field using the "analyzerClass" property. Could
this option please be documented somewhere for Blur 0.2.2?

I also tried the custom type definition suggested by Garrett, which uses
the EdgeNGramAnalyzer. I registered this type in blur-site.xml and
defined a ColumnDefinition that uses it. At query time, however, I ran
into an issue: Blur returned a message along the lines of "field is of
custom type and needs to be enclosed in """.

Anyway, I found the "analyzerClass" option much easier to configure, and
it worked fine for me.

Regards,
Dibyendu
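
Since the thread asks for the "analyzerClass" behavior to be written down, here is a minimal sketch of the constructor rule Aaron describes in the quoted reply: the class is usable only if it has a constructor taking the Lucene Version enum or a no-arg constructor, and anything else fails with an exception. This is not Blur's actual code. The `Version` enum, the three stand-in classes, and the `instantiate` helper are hypothetical names used so the sketch compiles without Lucene or Blur, and the order in which the two constructors are tried is an assumption.

```java
class AnalyzerLoaderSketch {
  // Hypothetical stand-in for org.apache.lucene.util.Version.
  public enum Version { LUCENE_43 }

  // Hypothetical stand-ins for analyzer classes with different constructors.
  public static class NoArgAnalyzer {
    public NoArgAnalyzer() {}
  }

  public static class VersionedAnalyzer {
    public final Version version;
    public VersionedAnalyzer(Version version) { this.version = version; }
  }

  public static class UnsupportedAnalyzer {
    public UnsupportedAnalyzer(String unused) {}
  }

  // Tries the (Version) constructor first, then the no-arg constructor.
  // Assumption: the real lookup order in Blur may differ.
  static Object instantiate(String className, Version version) {
    try {
      Class<?> clazz = Class.forName(className);
      try {
        return clazz.getConstructor(Version.class).newInstance(version);
      } catch (NoSuchMethodException noVersionConstructor) {
        return clazz.getConstructor().newInstance();
      }
    } catch (ReflectiveOperationException e) {
      // Reached when the class is missing or has neither supported constructor.
      throw new IllegalArgumentException("Cannot instantiate analyzer " + className, e);
    }
  }

  public static void main(String[] args) {
    Object a = instantiate(NoArgAnalyzer.class.getName(), Version.LUCENE_43);
    Object b = instantiate(VersionedAnalyzer.class.getName(), Version.LUCENE_43);
    System.out.println(a.getClass().getSimpleName());
    System.out.println(b.getClass().getSimpleName());
  }
}
```

The EdgeNGramAnalyzer quoted further down declares no constructor of its own, so it gets the implicit no-arg constructor and would pass a check of this kind.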



On Wed, May 14, 2014 at 7:33 AM, Aaron McCurry <amccurry@gmail.com> wrote:

> Garrett is correct that you can create a custom type.  However, you are
> also correct that you can specify the "analyzerClass" property, but only if
> the analyzer class has one of two kinds of constructor: the default
> constructor (no args) or one that takes the Lucene Version enum.  Otherwise
> it will throw an exception.  This also assumes that you are running a
> fairly recent version of Blur; if it's 0.2.2 (which I think it is), then
> you are likely good to use that option.
>
> Here's the code:
>
>
> https://git-wip-us.apache.org/repos/asf?p=incubator-blur.git;a=blob;f=blur-query/src/main/java/org/apache/blur/analysis/type/TextFieldTypeDefinition.java;h=049207bdb4f94cf03a4b0c74891eba129d13fbbb;hb=3967e154e7b064ad40b36d1d5832b7c7bcac44cd#l69
>
> Perhaps the reason it's not being taken is because the field has already
> been defined for the given table?
>
> If none of those possibilities are the problem I'm not sure what the
> problem could be.  Let us know how it goes.
>
> Aaron
>
>
>
>
> On Tue, May 13, 2014 at 12:12 PM, Garrett Barton
> <garrett.barton@gmail.com> wrote:
>
> > I think you have to create a custom TypeDefinition that calls your
> > analyzer under the covers. You can extend the TextFieldTypeDefinition,
> > if I remember right, and just override the analyzer it calls.
> >
> > ~Garrett
> >
> >
> > On Tue, May 13, 2014 at 11:54 AM, Dibyendu Bhattacharya <
> > dibyendu.bhattachary@gmail.com> wrote:
> >
> >> Hi ,
> >>
> >> I was trying to configure a custom analyzer (EdgeNGram) for a text field.
> >>
> >> Below is the very simple edge n-gram analyzer code, which works fine.
> >>
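> >> import java.io.IOException;
> >> import java.io.Reader;
> >>
> >> import org.apache.lucene.analysis.Analyzer;
> >> import org.apache.lucene.analysis.TokenStream;
> >> import org.apache.lucene.analysis.core.LowerCaseFilter;
> >> import org.apache.lucene.analysis.core.StopAnalyzer;
> >> import org.apache.lucene.analysis.core.StopFilter;
> >> import org.apache.lucene.analysis.ngram.EdgeNGramTokenFilter;
> >> import org.apache.lucene.analysis.standard.StandardFilter;
> >> import org.apache.lucene.analysis.standard.StandardTokenizer;
> >> import org.apache.lucene.util.Version;
> >>
> >> public class EdgeNGramAnalyzer extends Analyzer {
> >>   @Override
> >>   protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
> >>     // Tokenize, normalize, lowercase, and drop English stop words.
> >>     final StandardTokenizer src = new StandardTokenizer(Version.LUCENE_43, reader);
> >>     TokenStream tok = new StandardFilter(Version.LUCENE_43, src);
> >>     tok = new LowerCaseFilter(Version.LUCENE_43, tok);
> >>     tok = new StopFilter(Version.LUCENE_43, tok, StopAnalyzer.ENGLISH_STOP_WORDS_SET);
> >>     // Emit front-anchored n-grams of 3 to 20 characters for each token.
> >>     tok = new EdgeNGramTokenFilter(tok, EdgeNGramTokenFilter.Side.FRONT, 3, 20);
> >>     return new TokenStreamComponents(src, tok) {
> >>       @Override
> >>       protected void setReader(final Reader reader) throws IOException {
> >>         super.setReader(reader);
> >>       }
> >>     };
> >>   }
> >> }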
> >>
> >>
> >> I configured this analyzer for a ColumnDefinition using the following
> >> steps via the Thrift client:
> >>
> >>         ColumnDefinition customAnalyzerDefn = new ColumnDefinition();
> >>         customAnalyzerDefn.setFamily(FAMILY_NAME);
> >>         customAnalyzerDefn.setColumnName(COLUMN_NAME);
> >>         customAnalyzerDefn.setFieldType("text");
> >>
> >>         Map<String,String> analyzer = new HashMap<String,String>();
> >>         analyzer.put("analyzerClass", "x.y.z.EdgeNGramAnalyzer");
> >>         customAnalyzerDefn.setProperties(analyzer);
> >>
> >>         client.addColumnDefinition(TABLE_NAME, customAnalyzerDefn);
> >>
> >>
> >> I copied the jar containing the analyzer class into the Blur lib folder.
> >>
> >> But I do not see this analyzer getting called; Blur always uses the
> >> default StandardAnalyzer for the text field. Kindly let me know if I am
> >> missing something, or whether there is an issue with the "analyzerClass"
> >> property not getting set. I found that Blur uses this key to set the
> >> analyzer in TextFieldTypeDefinition.
> >>
> >> Regards,
> >> Dibyendu
> >>
> >
> >
>
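
To make the effect of the quoted analyzer's last filter concrete, here is a rough sketch of what front-anchored edge n-grams with a minimum length of 3 and a maximum of 20 look like for a single token. It is plain Java and does not call Lucene. The class and method names are hypothetical, and dropping tokens shorter than the minimum length reflects my understanding of the filter rather than something stated in the thread.

```java
import java.util.ArrayList;
import java.util.List;

class EdgeNGramSketch {
  // Returns the prefixes of token whose lengths run from minGram up to
  // maxGram (capped at the token length). Tokens shorter than minGram
  // yield an empty list.
  static List<String> frontEdgeNGrams(String token, int minGram, int maxGram) {
    List<String> grams = new ArrayList<String>();
    int longest = Math.min(maxGram, token.length());
    for (int len = minGram; len <= longest; len++) {
      grams.add(token.substring(0, len));
    }
    return grams;
  }

  public static void main(String[] args) {
    // prints [ana, anal, analy, analyz, analyze, analyzer]
    System.out.println(frontEdgeNGrams("analyzer", 3, 20));
  }
}
```

Indexing these prefixes is what lets a query such as "anal" match a document containing "analyzer" without a wildcard, which is the usual reason to pick an edge n-gram analyzer for a text field.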
