lucene-dev mailing list archives

From "Grant Ingersoll (JIRA)" <>
Subject [jira] Updated: (LUCENE-1058) New Analyzer for buffering tokens
Date Wed, 21 Nov 2007 02:53:43 GMT


Grant Ingersoll updated LUCENE-1058:

    Attachment: LUCENE-1058.patch

Here's a patch that modifies DocumentsWriter so that it no longer throws an IllegalArgumentException
when no Reader is specified.  As a result, an Analyzer needs to be able to handle a null Reader (this
still needs to be documented).  The semantics are that a null Reader means the Analyzer is producing
its Tokens by some other means.  I should probably spell this out in a new Field constructor
as well, but this should suffice for now, and I will revisit it after the break.
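
To illustrate the null-Reader semantics described above, here is a minimal sketch.
It is not the code from the patch and does not use the Lucene API: the class
NullReaderSketch, its tokens() method, and the whitespace tokenization are all
hypothetical stand-ins chosen to keep the example self-contained.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an Analyzer; this is NOT the Lucene API.
public class NullReaderSketch {

    // Tokens produced earlier "by some other means" (assumed for illustration).
    private final List<String> buffered;

    public NullReaderSketch(List<String> buffered) {
        this.buffered = new ArrayList<>(buffered);
    }

    // A null reader is treated as "emit the buffered tokens" instead of an error.
    public List<String> tokens(String fieldName, Reader reader) {
        if (reader == null) {
            return new ArrayList<>(buffered);
        }
        // Otherwise read all the text and split it on whitespace.
        StringBuilder text = new StringBuilder();
        char[] chunk = new char[1024];
        try {
            int n;
            while ((n = reader.read(chunk)) != -1) {
                text.append(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> out = new ArrayList<>();
        for (String token : text.toString().trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(token);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        NullReaderSketch analyzer =
                new NullReaderSketch(Arrays.asList("Buffered", "Tokens"));
        System.out.println(analyzer.tokens("body", new StringReader("Hello World")));
        System.out.println(analyzer.tokens("copy", null));
    }
}
```

The point of the sketch is only the branch on a null reader; how the buffered
tokens get there is left open, as it is in the description above.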

I also added a TestCollaboratingAnalyzer.  All tests pass.

> New Analyzer for buffering tokens
> ---------------------------------
>                 Key: LUCENE-1058
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>            Priority: Minor
>         Attachments: LUCENE-1058.patch, LUCENE-1058.patch
> In some cases, it would be handy to have Analyzer/Tokenizer/TokenFilters that could siphon
> off certain tokens and store them in a buffer to be used later in the processing pipeline.
> For example, if you want to have two fields, one lowercased and one not, but all the
> other analysis is the same, then you could save off the tokens to be output for a different
> field.
> Patch to follow, but I am still not sure about a couple of things, mostly how it plays
> with the new reuse API.
> See;#54397
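
The two-field example in the description above can be sketched roughly as follows.
This is an illustration under stated assumptions, not the patch itself and not the
Lucene API: the class TokenBufferSketch and its methods are hypothetical, and
whitespace splitting stands in for the shared analysis.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of "siphon tokens into a buffer"; NOT the Lucene API.
public class TokenBufferSketch {

    // Original-case tokens saved during the first pass for later reuse.
    private final List<String> buffer = new ArrayList<>();

    // Runs the shared analysis once: saves each token, returns the lowercased form.
    public List<String> analyzeLowercased(String text) {
        buffer.clear();
        List<String> lowered = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            buffer.add(token);                            // siphon off the original
            lowered.add(token.toLowerCase(Locale.ROOT));  // output for field one
        }
        return lowered;
    }

    // Tokens for the second field, taken from the buffer without re-analyzing.
    public List<String> bufferedTokens() {
        return new ArrayList<>(buffer);
    }

    public static void main(String[] args) {
        TokenBufferSketch sketch = new TokenBufferSketch();
        System.out.println(sketch.analyzeLowercased("Hello Lucene World"));
        System.out.println(sketch.bufferedTokens());
    }
}
```

Here the text is tokenized only once; the second field reuses the saved tokens,
which is the kind of situation where a field would have no Reader of its own.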

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
