lucene-solr-user mailing list archives

From Jay Potharaju <jspothar...@gmail.com>
Subject Re: Custom field using PatternCaptureGroupFilterFactory
Date Mon, 07 Mar 2016 17:20:38 GMT
Thanks Jack, the problem was my regex. The following regex worked:
<filter class="solr.PatternCaptureGroupFilterFactory" pattern="([a-zA-Z0-9]{1})" preserve_original="false"/>
Jay
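
Editorial note: the difference between the two patterns can be checked with plain java.util.regex, outside Solr. The sketch below is only an illustration. The class CaptureGroupDemo, the helper captures(), and the sample token "JAY" are hypothetical, and the anchored pattern in the last call is an assumed variant that does not appear in this thread. It also assumes that the filter emits one token per capture-group match, which is how the Javadoc linked below reads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; it is not part of Solr or Lucene.
class CaptureGroupDemo {

    // Collects the text of every capture group (group 1 and up) for every
    // match of the regex in the input, using repeated Matcher.find() calls.
    static List<String> captures(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            for (int g = 1; g <= m.groupCount(); g++) {
                if (m.group(g) != null) {
                    out.add(m.group(g));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample value, as it might look after KeywordTokenizer + UpperCaseFilter.
        String token = "JAY";

        // Original pattern: it matches "J" but has no parentheses, so no groups.
        System.out.println(captures("^[a-zA-Z0-9]{0,1}", token)); // prints []

        // Pattern from the fix: one group, matched at every character position.
        System.out.println(captures("([a-zA-Z0-9]{1})", token)); // prints [J, A, Y]

        // Assumed variant (not from the thread): anchor plus group, first character only.
        System.out.println(captures("^([a-zA-Z0-9])", token)); // prints [J]
    }
}
```

If the filter does emit every match, the unanchored pattern may produce one token per character rather than only the first letter. Whether that matters depends on how the field is queried, so it is worth confirming the actual token output on the Analysis screen of the Solr admin UI.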

On Sun, Mar 6, 2016 at 7:43 PM, Jack Krupansky <jack.krupansky@gmail.com>
wrote:

> The filter name, "Capture Group", says it all - only pattern groups are
> captured and you have not specified even a single group. See the example:
>
> http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/pattern/PatternCaptureGroupFilterFactory.html
>
> Groups are each enclosed within parentheses, as shown in the Javadoc
> example above.
>
> Since no groups were found, the filter applied this rule from its documentation:
> "If none of the patterns match, or if preserveOriginal is true, the
> original token will be preserved."
>
> http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/pattern/PatternCaptureGroupTokenFilter.html
>
> That should probably also say "or if no pattern groups match".
>
> To test regular expressions, try an interactive online tool, such as:
> https://regex101.com/
>
> -- Jack Krupansky
>
> On Sun, Mar 6, 2016 at 7:51 PM, Alexandre Rafalovitch <arafalov@gmail.com>
> wrote:
>
> > I don't see the brackets that mark the group you actually want to
> > capture. As per:
> >
> >
> http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/pattern/PatternCaptureGroupTokenFilter.html
> >
> > I am also not sure whether you actually need the "{0,1}" part.
> >
> > Regards,
> >    Alex.
> > ----
> > Newsletter and resources for Solr beginners and intermediates:
> > http://www.solr-start.com/
> >
> >
> > On 7 March 2016 at 04:25, Jay Potharaju <jspotharaju@gmail.com> wrote:
> > > Hi,
> > > I have a custom field for getting the first letter of a first name. For
> > > this I am using PatternCaptureGroupFilterFactory.
> > > This is not working as expected: it does not parse the data and return
> > > the first character of the string. Any suggestions on how to fix this?
> > >
> > > <fieldType class="solr.TextField" name="text_firstLetter">
> > >   <analyzer>
> > >     <tokenizer class="solr.KeywordTokenizerFactory"/>
> > >     <filter class="solr.UpperCaseFilterFactory"/>
> > >     <filter class="solr.PatternCaptureGroupFilterFactory" pattern="^[a-zA-Z0-9]{0,1}" preserve_original="false"/>
> > >   </analyzer>
> > > </fieldType>
> > >
> > > --
> > > Thanks
> > > Jay
> >
>



-- 
Thanks
Jay Potharaju
