lucene-dev mailing list archives

From Jan Høydahl (JIRA) <j...@apache.org>
Subject [jira] Created: (SOLR-1978) Create MappingTokenFilterFactory
Date Wed, 30 Jun 2010 19:45:52 GMT
Create MappingTokenFilterFactory
--------------------------------

                 Key: SOLR-1978
                 URL: https://issues.apache.org/jira/browse/SOLR-1978
             Project: Solr
          Issue Type: New Feature
          Components: Schema and Analysis
            Reporter: Jan Høydahl
            Priority: Minor


We need a mapping filter that is a counterpart to the MappingCharFilterFactory but is
designed to run after tokenization. It should read the same config file format as the
MappingCharFilterFactory does.
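
To illustrate the idea, here is a minimal sketch in plain Java, not the proposed Lucene/Solr implementation. It assumes the mapping file consists of rules of the form "source" => "target" with # comment lines, and it leaves out the handling of Unicode escape sequences inside the quoted strings. The class and method names (MappingSketch, parseRules, applyMapping) are hypothetical and exist only for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: parse mapping rules and apply them to one token.
public class MappingSketch {
    // Assumed rule syntax: "source" => "target"
    private static final Pattern RULE =
            Pattern.compile("^\"(.+)\"\\s*=>\\s*\"(.*)\"$");

    // Parse rule lines, skipping blank lines and lines starting with '#'.
    static Map<String, String> parseRules(List<String> lines) {
        Map<String, String> rules = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            Matcher m = RULE.matcher(line);
            if (!m.matches()) {
                throw new IllegalArgumentException("Invalid mapping rule: " + line);
            }
            rules.put(m.group(1), m.group(2));
        }
        return rules;
    }

    // Rewrite a single token, preferring the longest matching source
    // string at each position and copying unmatched characters as-is.
    static String applyMapping(String token, Map<String, String> rules) {
        int maxLen = 0;
        for (String source : rules.keySet()) {
            maxLen = Math.max(maxLen, source.length());
        }
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < token.length()) {
            boolean matched = false;
            for (int len = Math.min(maxLen, token.length() - i); len > 0; len--) {
                String target = rules.get(token.substring(i, i + len));
                if (target != null) {
                    out.append(target);
                    i += len;
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                out.append(token.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> rules = parseRules(List.of(
                "# accent folding",
                "\"\u00F8\" => \"o\"",
                "\"\u00E6\" => \"ae\""));
        System.out.println(applyMapping("H\u00F8ydahl", rules));
    }
}
```

A real token filter would apply the same rewrite to the term text of each token in the stream instead of to a standalone string.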

This will be a more generic approach to accent normalization than the ISOLatin1AccentFilterFactory,
whose mappings are hard-coded.

We need it as a TokenFilter because sometimes the normalization has to happen
far down in the analysis chain: earlier filters, such as stemming, synonyms or other
dictionary lookups, rely on the original value of the token.
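
The ordering problem can be shown with a small plain-Java sketch. It does not use the Lucene analysis API. The one-entry lemma dictionary, the fold function and all names are hypothetical stand-ins for a dictionary-based filter and an accent mapping filter.

```java
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch: why accent folding may need to run after a
// dictionary lookup that is keyed on the original, accented token.
public class ChainOrderSketch {
    // Toy dictionary: accented plural form -> accented lemma.
    static final Map<String, String> LEMMAS =
            Map.of("p\u00E2t\u00E9s", "p\u00E2t\u00E9");

    // Stand-in for a dictionary-based filter; unknown tokens pass through.
    static final UnaryOperator<String> LEMMATIZE =
            t -> LEMMAS.getOrDefault(t, t);

    // Stand-in for the mapping filter: folds two accented letters.
    static final UnaryOperator<String> FOLD =
            t -> t.replace('\u00E2', 'a').replace('\u00E9', 'e');

    // Dictionary lookup sees the original token, then accents are folded.
    static String lookupThenFold(String token) {
        return FOLD.apply(LEMMATIZE.apply(token));
    }

    // Folding first changes the token, so the dictionary lookup misses.
    static String foldThenLookup(String token) {
        return LEMMATIZE.apply(FOLD.apply(token));
    }

    public static void main(String[] args) {
        String token = "p\u00E2t\u00E9s";
        System.out.println(lookupThenFold(token)); // pate
        System.out.println(foldThenLookup(token)); // pates
    }
}
```

With a char filter the folding always happens before tokenization, which corresponds to the second ordering. A token filter lets the schema author choose the first.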

This patch would require a MappingTokenFilter in Lucene as well.


