lucene-dev mailing list archives

From "Michael Busch (JIRA)" <j...@apache.org>
Subject [jira] Updated: (LUCENE-580) Pre-analyzed fields
Date Sat, 28 Apr 2007 18:42:15 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Busch updated LUCENE-580:
---------------------------------

    Attachment: lucene-580.patch

Karl,

thanks again for your suggestions. I created a patch that takes a slightly different approach
from your latest patch. Similar to java.io.Reader, I added the public method reset() to TokenStream,
which does nothing by default. Subclasses may or may not override this method. I also added
the new class CachingTokenFilter to the analysis package, which does the same as your CachedPreAnalyzedField.
Before the DocumentWriter consumes the TokenStream, it calls reset() to reposition the stream
at the beginning.
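
To illustrate the idea, here is a minimal, self-contained sketch. It is not the code
from the attached patch and does not use the Lucene classes: SketchTokenStream,
ArrayTokenStream, SketchCachingFilter and ResetSketch are hypothetical stand-ins, and
tokens are plain strings. It only shows the pattern of a no-op reset() on the base
class, a caching filter that overrides it, and a consumer that calls reset() before
reading.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for TokenStream; tokens are plain strings here.
abstract class SketchTokenStream {
    /** Returns the next token, or null at the end of the stream. */
    abstract String next();

    /** Does nothing by default; subclasses may override it to rewind. */
    void reset() {}
}

// A one-shot stream over a fixed list of tokens. It does not override reset().
class ArrayTokenStream extends SketchTokenStream {
    private final String[] tokens;
    private int pos = 0;

    ArrayTokenStream(String... tokens) { this.tokens = tokens; }

    @Override
    String next() { return pos < tokens.length ? tokens[pos++] : null; }
}

// Hypothetical stand-in for CachingTokenFilter: buffers the wrapped stream
// on first use and replays the buffer after every reset().
class SketchCachingFilter extends SketchTokenStream {
    private final SketchTokenStream input;
    private List<String> cache;      // filled lazily on the first next()
    private Iterator<String> replay;

    SketchCachingFilter(SketchTokenStream input) { this.input = input; }

    @Override
    String next() {
        if (cache == null) {
            cache = new ArrayList<>();
            for (String t = input.next(); t != null; t = input.next()) {
                cache.add(t);
            }
            replay = cache.iterator();
        }
        return replay.hasNext() ? replay.next() : null;
    }

    @Override
    void reset() {
        if (cache != null) {
            replay = cache.iterator();   // rewind to the first cached token
        }
    }
}

class ResetSketch {
    // Mimics a consumer that calls reset() before reading the stream.
    static List<String> consume(SketchTokenStream stream) {
        stream.reset();
        List<String> out = new ArrayList<>();
        for (String t = stream.next(); t != null; t = stream.next()) {
            out.add(t);
        }
        return out;
    }
}
```

In this sketch a stream wrapped in SketchCachingFilter yields the same tokens every
time it is consumed, while a plain ArrayTokenStream is exhausted after the first pass
because its reset() is the inherited no-op.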

With this approach it is no longer necessary to introduce a TokenStreamFactory and the
PreAnalyzedField classes, which is simpler and more consistent with the Analyzer API in my
opinion. Yet it still allows the Tokens of a stream to be consumed more than once, which should
satisfy your needs?

Please let me know what you think about this new patch. Maybe other committers could take
a look as well, since this is an API change (well, extension) to two very common classes:
TokenStream and Field.

> Pre-analyzed fields
> -------------------
>
>                 Key: LUCENE-580
>                 URL: https://issues.apache.org/jira/browse/LUCENE-580
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>    Affects Versions: 1.9
>            Reporter: Karl Wettin
>         Assigned To: Michael Busch
>            Priority: Minor
>         Attachments: lucene-580.patch, preanalyze.tar, trunk.diff
>
>
> Adds the possibility to set a TokenStream at Field construction time, available as tokenStreamValue
> in addition to stringValue, readerValue and binaryValue.
> There might be some problems with mixing stored fields with the same name as a field
> with tokenStreamValue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org