lucene-dev mailing list archives

From "Mark Lassau (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1151) Fix StandardAnalyzer to not mis-identify HOST as ACRONYM by default
Date Fri, 05 Sep 2008 06:47:44 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12628573#action_12628573 ]

Mark Lassau commented on LUCENE-1151:
-------------------------------------

I love the solution you have come up with, but would suggest that it be moved to StandardTokenizer
instead of StandardAnalyzer.
StandardTokenizer is the class with the actual problem. Fixing it there would mean that everyone
who uses StandardTokenizer gets the fix by default, not just users of StandardAnalyzer.

For instance, see LUCENE-1373, where most of the contrib Analyzers still suffer the buggy
behavior with no workaround available.
I think that moving your "defaulting logic" to the tokenizer would fix all these Analyzers
in one fell swoop.
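
To illustrate the idea, here is a minimal sketch of where the default would live. The classes
SketchTokenizer and SketchContribAnalyzer are hypothetical stand-ins, not Lucene classes, and
the token-typing rule is a toy assumption rather than the real grammar. The point is only that
an analyzer which builds the tokenizer with its no-argument constructor inherits the fixed default.

```java
// Hypothetical stand-in for a tokenizer; NOT Lucene's StandardTokenizer.
final class SketchTokenizer {
    // Assumption: the corrected behavior becomes the tokenizer-level default.
    static final boolean DEFAULT_REPLACE_INVALID_ACRONYM = true;

    private final boolean replaceInvalidAcronym;

    // Callers that do not care get the fixed behavior.
    SketchTokenizer() {
        this(DEFAULT_REPLACE_INVALID_ACRONYM);
    }

    // Callers that need the old behavior can still ask for it explicitly.
    SketchTokenizer(boolean replaceInvalidAcronym) {
        this.replaceInvalidAcronym = replaceInvalidAcronym;
    }

    // Toy typing rule (an assumption for illustration only):
    // no dot -> ALPHANUM, trailing dot -> ACRONYM, otherwise a host name.
    String typeOf(String token) {
        if (!token.contains(".")) {
            return "ALPHANUM";
        }
        if (token.endsWith(".")) {
            return "ACRONYM";
        }
        // With the flag off, the host is mislabeled for back compatibility.
        return replaceInvalidAcronym ? "HOST" : "ACRONYM";
    }
}

// Hypothetical stand-in for a contrib analyzer that never sets the flag itself.
final class SketchContribAnalyzer {
    String typeOf(String token) {
        // The no-argument constructor is used, so the tokenizer default applies.
        return new SketchTokenizer().typeOf(token);
    }
}
```

In this sketch, SketchContribAnalyzer needs no change of its own to pick up the corrected
labeling, while new SketchTokenizer(false) still gives the back-compatible result.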

I would provide suggested patches, but I am just about to go on holidays for 3 weeks. Is there
a planned release date for v2.3.3 or v2.4?

> Fix StandardAnalyzer to not mis-identify HOST as ACRONYM by default
> -------------------------------------------------------------------
>
>                 Key: LUCENE-1151
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1151
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.4
>
>         Attachments: LUCENE-1151.patch
>
>
> Coming out of the discussion around back compatibility, it seems best to default StandardAnalyzer
> to properly fix LUCENE-1068, while preserving the ability to get the back-compatible behavior
> in the rare event that it's desired.
> This just means changing replaceInvalidAcronym = false to = true, and adding a
> clear entry to CHANGES.txt that this very slight non-back-compatible change took place.
> Spinoff from here:
>     http://www.gossamer-threads.com/lists/lucene/java-dev/57517#57517
> I'll commit that change in a day or two.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



