lucene-dev mailing list archives

From Egor Pahomov <epaho...@griddynamics.com>
Subject Impossibility to pass fieldName to analyzers through TokenizerChain::getStream()
Date Wed, 21 Mar 2012 11:49:00 GMT
    I have a different stop-word dictionary for each field, but all of
these fields are matched by a single dynamic field, i.e. a single field
type and therefore a single analyzer.

    It seems I need an improved TokenFilter that is aware of the name of
the field it analyzes. Currently fieldName is passed into
TokenizerChain.getStream(), but it is not used there. How can I pass
fieldName to the token filters?

    I'm thinking of adding a new method, TokenStream create(String field,
TokenStream input), to the TokenFilterFactory interface and implementing it
in BaseTokenFilterFactory by delegating to the single-argument
create(TokenStream input). After that I'd be able to pass fieldName to each
TokenFilterFactory in TokenizerChain.getStream(String fieldName, Reader
reader). As an alternative, I could introduce a FieldAwareTokenFilterFactory
interface with a two-argument create() and use "instanceof" in
TokenizerChain.getStream().
    Is this a good solution for my problem?
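
    To make the second option concrete, here is a minimal sketch of the
"instanceof" dispatch. It does not use the real Solr or Lucene classes: the
interfaces below are simplified stand-ins that only borrow the names from
this message, a List<String> plays the role of a TokenStream, tokenization
is plain whitespace splitting, and the field names and stop words are made
up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class FieldAwareChainSketch {

    /** Stand-in for TokenFilterFactory; a List<String> stands in for a TokenStream. */
    public interface TokenFilterFactory {
        List<String> create(List<String> input);
    }

    /** Hypothetical opt-in interface for factories that want the field name. */
    public interface FieldAwareTokenFilterFactory extends TokenFilterFactory {
        List<String> create(String fieldName, List<String> input);
    }

    /** Field-unaware filter: lower-cases every token. */
    public static final class LowerCaseFactory implements TokenFilterFactory {
        @Override
        public List<String> create(List<String> input) {
            List<String> out = new ArrayList<>();
            for (String token : input) {
                out.add(token.toLowerCase(Locale.ROOT));
            }
            return out;
        }
    }

    /** Field-aware filter: selects the stop-word set by field name. */
    public static final class PerFieldStopFactory implements FieldAwareTokenFilterFactory {
        private final Map<String, Set<String>> stopWordsByField;

        public PerFieldStopFactory(Map<String, Set<String>> stopWordsByField) {
            this.stopWordsByField = stopWordsByField;
        }

        @Override
        public List<String> create(List<String> input) {
            // No field name available: leave the tokens untouched.
            return input;
        }

        @Override
        public List<String> create(String fieldName, List<String> input) {
            // Unknown fields get an empty stop-word set, so nothing is removed.
            Set<String> stopWords = stopWordsByField.getOrDefault(fieldName, Set.of());
            List<String> out = new ArrayList<>();
            for (String token : input) {
                if (!stopWords.contains(token)) {
                    out.add(token);
                }
            }
            return out;
        }
    }

    /** Mirrors the shape of getStream(fieldName, reader): tokenize, then apply each filter. */
    public static List<String> getStream(String fieldName, String text,
                                         List<TokenFilterFactory> filters) {
        List<String> stream = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                stream.add(token);
            }
        }
        for (TokenFilterFactory factory : filters) {
            if (factory instanceof FieldAwareTokenFilterFactory) {
                // Only field-aware factories receive the field name.
                stream = ((FieldAwareTokenFilterFactory) factory).create(fieldName, stream);
            } else {
                stream = factory.create(stream);
            }
        }
        return stream;
    }

    /** Example chain with made-up field names and stop words. */
    public static List<TokenFilterFactory> demoFilters() {
        Map<String, Set<String>> stopWords = Map.of(
                "title_en", Set.of("the", "a"),
                "title_de", Set.of("der", "die", "das"));
        return List.of(new LowerCaseFactory(), new PerFieldStopFactory(stopWords));
    }

    public static void main(String[] args) {
        System.out.println(getStream("title_en", "The quick fox", demoFilters()));
        System.out.println(getStream("title_de", "Der schnelle Fuchs", demoFilters()));
    }
}
```

    In this sketch existing factories stay untouched, and only those that
implement the extra interface see the field name. The first option (a
two-argument create() on the base interface that delegates to the
single-argument one) would avoid the "instanceof" check at the cost of
changing the shared interface.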

    Egor
