lucene-solr-user mailing list archives

From John Blythe <j...@curvolabs.com>
Subject Re: KeepWord
Date Tue, 02 Feb 2016 13:30:52 GMT
nice tip. i appreciate it!

-- 
*John Blythe*
Product Manager & Lead Developer

251.605.3071 | john@curvolabs.com
www.curvolabs.com

58 Adams Ave
Evansville, IN 47713

On Mon, Feb 1, 2016 at 4:55 PM, Erik Hatcher <erik.hatcher@gmail.com> wrote:

> And if you want to have the “kept” words stored, consider the trick used
> in example/files for url/e-mail extraction mentioned here (note the related
> fix in the patch in the JIRA issue mentioned):
>
>    https://lucidworks.com/blog/2016/01/27/example_files/
>
>
>
>
> > On Feb 1, 2016, at 3:23 PM, John Blythe <john@curvolabs.com> wrote:
> >
> > i immediately realized after sending that i'd had stored="true" in the
> > field's config and that it was storing the original data, not the
> > processed data. silly me, thanks anyway!
> >
> > On Mon, Feb 1, 2016 at 3:18 PM, John Blythe <john@curvolabs.com> wrote:
> >
> >> hi all,
> >>
> >> i'm having trouble with what would seem to be a pretty straightforward
> >> filter.
> >>
> >> i'm trying to 'tag' documents based off of a list of relevant words
> >> that a description field may contain. if the data contains any of the
> >> words then this field is populated with it and acts as a quick
> >> reference for relevant/bucketed documents.
> >>
> >> i receive no errors when reloading the core or indexing the data. each
> >> document, however, has its description listed in this tag field *even if
> >> none of the targeted words are in it.*
> >>
> >> here's the analyzer, tokenizer, and filter:
> >>
> >> <analyzer>
> >>   <tokenizer class="solr.StandardTokenizerFactory"/>
> >>   <filter class="solr.KeepWordFilterFactory" words="tags.txt" ignoreCase="true"/>
> >> </analyzer>
> >>
> >> to add to the confusion, when i run test data through both of the
> >> appropriate FieldName/FieldType in the Analysis UI I get the expected
> >> results: the non-targeted words are left out of processing.
> >>
> >> thanks for any info/help-
> >>
>
>
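For reference, a minimal schema sketch of the setup discussed in this thread. A field's stored value is always the original input, so the KeepWordFilter only changes the indexed terms, which is why the Analysis UI showed the expected output while the returned documents did not. The field and type names ("tags", "text_tags") and the copyField from "description" are assumptions for illustration; they do not appear in the original messages.

```xml
<!-- Hypothetical type name; the analyzer is the one posted in the thread. -->
<fieldType name="text_tags" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Keeps only tokens listed in tags.txt; affects indexed terms only. -->
    <filter class="solr.KeepWordFilterFactory" words="tags.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field name. stored="false" avoids returning the raw,
     unfiltered description text as if it were the tag list. -->
<field name="tags" type="text_tags" indexed="true" stored="false"/>

<!-- Assumes the source field is called "description". -->
<copyField source="description" dest="tags"/>
```

With stored="false" the field is not returned in search results, but faceting on it (facet.field=tags) should list only the kept words, since faceting reads the indexed terms. Getting the kept words back as stored values needs something like the example/files approach Erik links to.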
