lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <>
Subject [jira] [Updated] (LUCENE-4317) does not reuse its inlined Keyword-TokenStream
Date Tue, 21 Aug 2012 09:08:38 GMT


Uwe Schindler updated LUCENE-4317:

    Attachment: LUCENE-4317.patch

- Streamlines the handling of NumericTokenStream and a new internal StringTokenStream. The previous
code was not easy to read.
- Reuses both (though, of course, not across Field instances). This greatly improves the performance of
reused Field instances, because creating a new TokenStream for each small String is expensive
(2 LinkedHashMaps, addition of attributes, ...).
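
To illustrate the reuse pattern described above, here is a minimal, dependency-free sketch. It is not the code from the attached patch and does not use Lucene's classes: ReuseSketch, SingleTokenStream, SketchField, setValue() and setStringValue() are hypothetical names chosen for illustration. The idea is that the field creates its stream lazily on first use and afterwards only updates the value, so the per-stream setup cost is paid once per field instance.

```java
// Hypothetical names throughout; this mimics the pattern only and is not Lucene's API.
final class ReuseSketch {

    /** Emits one string as a single token; can be re-armed for a new value. */
    static final class SingleTokenStream {
        private String value;
        private boolean used = true; // nothing to emit until reset() is called

        void setValue(String value) {
            this.value = value;
            this.used = true; // force the consumer to call reset() again
        }

        void reset() {
            used = false;
        }

        boolean incrementToken() {
            if (used) {
                return false;
            }
            used = true;
            return true;
        }

        String term() {
            return value;
        }
    }

    /** A field that lazily creates its stream once and reuses it afterwards. */
    static final class SketchField {
        private String value;
        private SingleTokenStream stream; // lazily initialized, then reused

        SketchField(String value) {
            this.value = value;
        }

        void setStringValue(String value) {
            this.value = value;
        }

        SingleTokenStream tokenStream() {
            if (stream == null) {
                stream = new SingleTokenStream(); // the expensive part happens once
            }
            stream.setValue(value);
            return stream;
        }
    }

    public static void main(String[] args) {
        SketchField field = new SketchField("doc-1");
        SingleTokenStream first = field.tokenStream();
        first.reset();
        while (first.incrementToken()) {
            System.out.println(first.term());
        }

        field.setStringValue("doc-2"); // reuse the same field for the next document
        SingleTokenStream second = field.tokenStream();
        second.reset();
        while (second.incrementToken()) {
            System.out.println(second.term());
        }
        System.out.println(first == second); // same stream instance both times
    }
}
```

In this sketch the consumer is expected to call reset() before iterating, and a second field instance would still allocate its own stream.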

We should maybe think about a way to "cache" the stream instances across several Field instances
(like IndexWriter did in 3.x). I was thinking about a singleton Analyzer...
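
One possible reading of such a cache, sketched without any Lucene classes, is a per-thread stream that every field on that thread borrows. This is an assumption made for illustration only, not a design proposed in the issue: SharedStreamSketch, SingleTokenStream and streamFor() are hypothetical names.

```java
// Hypothetical sketch of a per-thread shared stream; not Lucene's API.
final class SharedStreamSketch {

    /** Emits one string as a single token; can be re-armed for a new value. */
    static final class SingleTokenStream {
        private String value;
        private boolean used = true;

        void setValue(String value) {
            this.value = value;
            this.used = true;
        }

        void reset() {
            used = false;
        }

        boolean incrementToken() {
            if (used) {
                return false;
            }
            used = true;
            return true;
        }

        String term() {
            return value;
        }
    }

    // One stream per thread, shared by every field processed on that thread.
    private static final ThreadLocal<SingleTokenStream> PER_THREAD =
            ThreadLocal.withInitial(SingleTokenStream::new);

    /** Returns the calling thread's shared stream, loaded with the given value. */
    static SingleTokenStream streamFor(String value) {
        SingleTokenStream stream = PER_THREAD.get();
        stream.setValue(value);
        return stream;
    }
}
```

A shared stream like this is only safe if each caller fully consumes it before the next field on the same thread asks for one.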

For now this patch helps a lot.

> does not reuse its inlined Keyword-TokenStream
> ---------------------------------------------------------
>                 Key: LUCENE-4317
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 4.0-BETA
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 5.0, 4.0
>         Attachments: LUCENE-4317.patch
> contains an inlined Keyword-TokenStream. Unfortunately this one is recreated
> all the time, even when one reuses the same Field instance. The NumericTokenStream
> is reused, but the Keyword one is not.
> We should apply the same logic and lazily initialize the TokenStream, with a setter for the String
> value and reset(). This would look identical to SetNumeric(xx).
