lucene-dev mailing list archives

From "Carrie Coy (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SOLR-4122) EnglishMinimalStemmer incorrectly tokenizes words ending in "hes" and "xes"
Date Wed, 28 Nov 2012 21:01:58 GMT
Carrie Coy created SOLR-4122:
--------------------------------

             Summary: EnglishMinimalStemmer incorrectly tokenizes words ending in "hes" and "xes"
                 Key: SOLR-4122
                 URL: https://issues.apache.org/jira/browse/SOLR-4122
             Project: Solr
          Issue Type: Bug
          Components: Schema and Analysis
    Affects Versions: 4.0
            Reporter: Carrie Coy


The stemmer stems "dishes" to "dishe" and "boxes" to "boxe". It seems like this addition to
the 'e' case would fix it:

case 'e':
        if (len > 3 && s[len-3] == 'i' && s[len-4] != 'a' && s[len-4] != 'e') {
          s[len - 3] = 'y';
          return len - 2;
        }
        // proposed addition:
        if (s[len-3] == 'x' || s[len-3] == 'h')
          return len-2;
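
To make the proposal easier to try out, here is a minimal, self-contained sketch of how the
check might behave. Only the 'e' case comes from the report above. The class name
MinimalStemSketch, the String helper, and the surrounding guard and switch are assumptions
made for illustration; this is not the actual Lucene source.

```java
// Hypothetical stand-in for the stemmer; only the 'e' case is taken from the report.
class MinimalStemSketch {

    // Stems in place and returns the new length of the word held in s[0..len).
    static int stem(char[] s, int len) {
        // Assumed guard: only words of length >= 3 that end in 's' are touched.
        if (len < 3 || s[len - 1] != 's') {
            return len;
        }
        switch (s[len - 2]) {
            case 'u':
            case 's':
                return len; // assumed: "-us" and "-ss" endings are left alone
            case 'e':
                // From the report: "-ies" becomes "-y" unless preceded by 'a' or 'e'.
                if (len > 3 && s[len - 3] == 'i' && s[len - 4] != 'a' && s[len - 4] != 'e') {
                    s[len - 3] = 'y';
                    return len - 2;
                }
                // Proposed addition: strip "es" after 'x' or 'h'.
                if (s[len - 3] == 'x' || s[len - 3] == 'h') {
                    return len - 2;
                }
                return len - 1; // otherwise drop only the trailing 's'
            default:
                return len - 1; // assumed: plain plural, drop the trailing 's'
        }
    }

    // Convenience wrapper (hypothetical) so the sketch can be called with a String.
    static String stem(String word) {
        char[] buf = word.toCharArray();
        return new String(buf, 0, stem(buf, buf.length));
    }

    public static void main(String[] args) {
        for (String w : new String[] {"dishes", "boxes", "ponies", "cats", "aches"}) {
            System.out.println(w + " -> " + stem(w));
        }
    }
}
```

In this sketch "dishes" and "boxes" come out as "dish" and "box". The same check also turns
"aches" into "ach", so the rule may need further conditions before it is safe for every
word ending in "hes".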


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

