lucene-dev mailing list archives

From "Spyros Kapnissis (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (LUCENE-3508) Decompounders based on CompoundWordTokenFilterBase cannot be used with custom attributes
Date Sun, 23 Oct 2011 00:40:32 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-3508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Spyros Kapnissis updated LUCENE-3508:
-------------------------------------

    Attachment: LUCENE-3508.patch

I am attaching a patch that uses the TokenStream API instead of Token for the decompounders
(CompoundWordTokenFilterBase, DictionaryCompoundWordTokenFilter, HyphenationCompoundWordTokenFilter).
Non-common attributes are now passed on correctly through the analyzer chain. A unit test of the
functionality is included.

Hope it helps. Any improvements are of course welcome.
                
> Decompounders based on CompoundWordTokenFilterBase cannot be used with custom attributes
> ----------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3508
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3508
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: modules/analysis
>    Affects Versions: 3.4, 4.0
>            Reporter: Spyros Kapnissis
>         Attachments: LUCENE-3508.patch
>
>
> The CompoundWordTokenFilterBase.setToken method calls clearAttributes() and then
> resets only the default Token attributes (term, position, flags, etc.), so any
> custom attributes lose their values. Commenting out clearAttributes() seems to do the trick,
> but causes the TestCompoundWordTokenFilter tests to fail.
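
To make the failure mode concrete, here is a minimal, self-contained sketch in plain Java. It does not use Lucene: AttributeLossSketch is a hypothetical stand-in for a token stream's attribute set, and the attribute names ("term", "position", "custom") and both emit methods are illustrative assumptions. The snapshot-and-restore approach shown is one possible way to keep custom attributes; the message does not say whether the attached patch works this way.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of a token stream's attributes; NOT Lucene's AttributeSource.
final class AttributeLossSketch {
    // Attribute values keyed by name. "term" and "position" stand in for the
    // default Token attributes; any other key plays the role of a custom attribute.
    private final Map<String, String> attributes = new LinkedHashMap<>();

    void set(String name, String value) { attributes.put(name, value); }

    String get(String name) { return attributes.get(name); }

    // Models clearAttributes(): every attribute is reset, custom ones included.
    void clearAttributes() { attributes.clear(); }

    // Takes a snapshot of all current attribute values.
    Map<String, String> captureState() { return new LinkedHashMap<>(attributes); }

    // Replaces all current attribute values with a previously taken snapshot.
    void restoreState(Map<String, String> state) {
        attributes.clear();
        attributes.putAll(state);
    }

    // Lossy variant: clears everything, then writes back only the attributes
    // it knows about, so the custom attribute is gone afterwards.
    static void emitSubwordLossy(AttributeLossSketch stream, String subword) {
        String position = stream.get("position");
        stream.clearAttributes();
        stream.set("term", subword);
        stream.set("position", position);
    }

    // Preserving variant: restores the full snapshot of the original token,
    // then overwrites only the term with the subword.
    static void emitSubwordPreserving(AttributeLossSketch stream,
                                      Map<String, String> originalState,
                                      String subword) {
        stream.restoreState(originalState);
        stream.set("term", subword);
    }

    public static void main(String[] args) {
        AttributeLossSketch stream = new AttributeLossSketch();
        stream.set("term", "Fussballpumpe");
        stream.set("position", "1");
        stream.set("custom", "x");
        Map<String, String> original = stream.captureState();

        emitSubwordLossy(stream, "Fussball");
        System.out.println("lossy custom = " + stream.get("custom"));

        emitSubwordPreserving(stream, original, "Fussball");
        System.out.println("preserving custom = " + stream.get("custom"));
    }
}
```

In the lossy variant the custom value is missing after the subword is emitted, while the preserving variant keeps it and changes only the term.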

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

