lucene-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] Created: (SOLR-2051) analysis.jsp is incorrect for protWords etc
Date Mon, 16 Aug 2010 18:00:27 GMT
analysis.jsp is incorrect for protWords etc
-------------------------------------------

                 Key: SOLR-2051
                 URL: https://issues.apache.org/jira/browse/SOLR-2051
             Project: Solr
          Issue Type: Bug
          Components: web gui
    Affects Versions: 3.1, 4.0
            Reporter: Robert Muir


analysis.jsp gives incorrect results if you use "protwords.txt", "stemdict.txt", or the
like.

This is because these features are now implemented with KeywordAttribute (so you can easily
override any stemmer, etc.).

For example, if your schema had "foobars" in protwords.txt, analysis.jsp would show it being
stemmed to "foobar", even though this doesn't actually happen.

The problem is that this jsp is downconverting the entire tokenstream to Token in between
processing, so it silently discards KeywordAttribute and you get the wrong result.
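
The effect can be reproduced with a minimal, self-contained sketch. It does not use the Lucene or Solr APIs: the class KeywordLossSketch, the Tok type, the toy "strip a trailing s" stemmer, and the downconvert step are all hypothetical stand-ins, assumed only to model a keyword flag that is lost when tokens are copied between stages.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

public class KeywordLossSketch {

    /** Simplified stand-in for a token and its attributes; not Lucene's API. */
    static final class Tok {
        final String term;
        final boolean keyword; // models the flag carried by KeywordAttribute

        Tok(String term, boolean keyword) {
            this.term = term;
            this.keyword = keyword;
        }
    }

    /** Models the protected-words stage: flags listed terms as keywords. */
    static List<Tok> markKeywords(List<Tok> in, Set<String> protWords) {
        List<Tok> out = new ArrayList<>();
        for (Tok t : in) {
            out.add(new Tok(t.term, t.keyword || protWords.contains(t.term)));
        }
        return out;
    }

    /** Toy stemmer (an assumption): strips one trailing "s" unless flagged. */
    static List<Tok> stem(List<Tok> in) {
        List<Tok> out = new ArrayList<>();
        for (Tok t : in) {
            boolean strip = !t.keyword && t.term.length() > 1 && t.term.endsWith("s");
            String term = strip ? t.term.substring(0, t.term.length() - 1) : t.term;
            out.add(new Tok(term, t.keyword));
        }
        return out;
    }

    /** Models the lossy copy between stages: only the term text survives. */
    static List<Tok> downconvert(List<Tok> in) {
        List<Tok> out = new ArrayList<>();
        for (Tok t : in) {
            out.add(new Tok(t.term, false)); // keyword flag silently dropped
        }
        return out;
    }

    /** Runs one word through both stages, optionally with the lossy copy. */
    static String analyze(String word, Set<String> protWords, boolean lossy) {
        List<Tok> toks = new ArrayList<>();
        toks.add(new Tok(word, false));
        toks = markKeywords(toks, protWords);
        if (lossy) {
            toks = downconvert(toks);
        }
        toks = stem(toks);
        return toks.get(0).term;
    }

    public static void main(String[] args) {
        Set<String> prot = Collections.singleton("foobars");
        System.out.println(analyze("foobars", prot, false)); // prints "foobars"
        System.out.println(analyze("foobars", prot, true));  // prints "foobar"
    }
}
```

With the flag preserved, the protected word passes through the stemmer unchanged; with the lossy copy in between, the stemmer no longer sees the flag and stems the word anyway, which matches the wrong output described above.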

Note: this issue isn't about *displaying* other attributes such as KeywordAttribute (which
would be a new feature). It's about not throwing them away, so that the analysis actually
represents what happens.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

