lucene-dev mailing list archives

From "Nikolay Khitrin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-8265) WordDelimiterFilter should pass through terms marked as keywords
Date Mon, 23 Apr 2018 13:21:00 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-8265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448135#comment-16448135 ]

Nikolay Khitrin commented on LUCENE-8265:
-----------------------------------------

This is a breaking change.

For example, the keyword attribute can be used to bypass stemming (as mentioned in the
KeywordAttribute javadoc) _after_ WordDelimiterFilter.

It should at least be marked as breaking in the changelog. A better solution might be to
provide this as an option for the delimiter filter.
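
To make the concern concrete, here is a minimal sketch in plain Java. It is a toy model, not
the Lucene API: the Token class, the passThroughKeywords option, and the trailing-"s" stemmer
are hypothetical stand-ins, and the sketch assumes that the parts of a split term inherit the
keyword flag.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: these types imitate the idea, they are not Lucene classes. */
final class KeywordOptionSketch {

    /** A term plus a boolean standing in for KeywordAttribute. */
    static final class Token {
        final String term;
        final boolean keyword;

        Token(String term, boolean keyword) {
            this.term = term;
            this.keyword = keyword;
        }
    }

    /**
     * Splits terms on non-alphanumeric characters.
     * passThroughKeywords is a hypothetical option: when true, keyword terms stay whole.
     */
    static List<Token> splitOnDelimiters(List<Token> in, boolean passThroughKeywords) {
        List<Token> out = new ArrayList<>();
        for (Token t : in) {
            if (t.keyword && passThroughKeywords) {
                out.add(t);
                continue;
            }
            for (String part : t.term.split("[^A-Za-z0-9]+")) {
                if (!part.isEmpty()) {
                    // Assumption of this sketch: parts inherit the keyword flag.
                    out.add(new Token(part, t.keyword));
                }
            }
        }
        return out;
    }

    /** Crude stand-in for a stemmer: strips a trailing "s" unless the token is a keyword. */
    static List<String> stem(List<Token> in) {
        List<String> out = new ArrayList<>();
        for (Token t : in) {
            boolean strip = !t.keyword && t.term.length() > 1 && t.term.endsWith("s");
            out.add(strip ? t.term.substring(0, t.term.length() - 1) : t.term);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Token> input = List.of(new Token("cross-sells", true), new Token("up-sells", false));
        System.out.println(stem(splitOnDelimiters(input, false))); // [cross, sells, up, sell]
        System.out.println(stem(splitOnDelimiters(input, true)));  // [cross-sells, up, sell]
    }
}
```

With the option off, the keyword term is still split and only stemming is bypassed. With it
on, the same analysis chain emits a different term for the same input, which is why an
explicit option (or a changelog entry) matters for existing indexes.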

> WordDelimiterFilter should pass through terms marked as keywords
> ----------------------------------------------------------------
>
>                 Key: LUCENE-8265
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8265
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Mike Sokolov
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This will help in cases where some terms containing separator characters should be split,
> but others should not.  For example, this will enable a filter that identifies things that
> look like fractions and marks them as keywords so that 1/2 does not become 12, while
> doing splitting and joining on terms that look like part numbers containing slashes, e.g. something
> like "sn-999123/1" might sometimes be written "sn-999123-1".
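
The fraction case described above can be sketched in plain Java as follows. This is a toy
model under stated assumptions, not the Lucene API: the FRACTION pattern, isKeyword, and
catenate are hypothetical names, and "looks like a fraction" is assumed to mean digits, a
slash, then digits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Toy model only: plain Java that imitates the idea, not the Lucene API. */
final class FractionKeywordSketch {

    // Assumed definition of "looks like a fraction": digits, a slash, digits.
    private static final Pattern FRACTION = Pattern.compile("\\d+/\\d+");

    /** Stand-in for an upstream filter that would set the keyword flag. */
    static boolean isKeyword(String term) {
        return FRACTION.matcher(term).matches();
    }

    /**
     * Stand-in for a delimiter filter that joins the parts of a term:
     * keyword terms pass through unchanged, other terms lose their separators.
     */
    static List<String> catenate(List<String> terms) {
        List<String> out = new ArrayList<>();
        for (String term : terms) {
            out.add(isKeyword(term) ? term : term.replaceAll("[^A-Za-z0-9]+", ""));
        }
        return out;
    }

    public static void main(String[] args) {
        // The fraction survives; both spellings of the part number collapse to one term.
        System.out.println(catenate(List.of("1/2", "sn-999123/1", "sn-999123-1")));
        // [1/2, sn9991231, sn9991231]
    }
}
```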



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


