lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] [Commented] (LUCENE-4317) does not reuse its inlined Keyword-TokenStream
Date Tue, 21 Aug 2012 12:10:38 GMT


Robert Muir commented on LUCENE-4317:

To come back to Robert: We can of course remove the whole FieldTypes stuff altogether and
only do this in Analyzer (which is then somehow a schema replacement). But then Analyzer.tokenStream()
should be able to take other data types than Reader. If this were the case, I would +1
all your suggestions; it would also solve the stupid wrapping of Strings with StringReader
(that would be needed with removal of StringField)

Well, it's sorta always been this way: we tell people to check and make sure they use the same
analysis at index and query time. But with the current situation, if they use things like StringFields
and QueryParser, they follow our best practices and get bogus or no results; that's why I get
frustrated (this NOT_ANALYZED confusion has come up on the ML many times).
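
To make the mismatch concrete, here is a minimal plain-Java sketch. It does not use Lucene at all: `AnalysisMismatchSketch` and its methods are hypothetical names, and the "index" is just a set of terms. It only assumes that the index side stores the value verbatim as one term while the query side lowercases and splits on whitespace.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; none of these names are Lucene APIs.
final class AnalysisMismatchSketch {

    // Index side: each value is stored verbatim as a single term (no analysis).
    static Set<String> indexVerbatim(String... values) {
        Set<String> terms = new HashSet<>();
        for (String v : values) {
            terms.add(v);
        }
        return terms;
    }

    // Query side: the query text is lowercased and split on whitespace.
    static List<String> analyzeQuery(String query) {
        List<String> out = new ArrayList<>();
        for (String t : query.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // True if at least one analyzed query term exists in the index.
    static boolean anyMatch(Set<String> indexedTerms, List<String> queryTerms) {
        for (String t : queryTerms) {
            if (indexedTerms.contains(t)) {
                return true;
            }
        }
        return false;
    }
}
```

With the value "New York" indexed verbatim, the analyzed query terms are "new" and "york", so neither matches the single stored term, even though the user typed exactly what was indexed.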

By the way, as far as taking other data types than Reader, I have no idea; I certainly feel
like if we want to take String we could just apply StringReader ourselves by default (someone
could override). The problem with taking something other than Reader would be the existence
of charfilters in the chain (as these work only on Readers).
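
A rough sketch of that idea, in plain Java with only `java.io`: the Reader-based method stays the core entry point (so Reader-wrapping char filters keep working), and a String overload wraps its argument in a StringReader by default. `SketchAnalyzer`, `LowerCaseCharFilter` and `WhitespaceSketchAnalyzer` are hypothetical stand-ins, not Lucene's Analyzer or CharFilter classes.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an analyzer; NOT Lucene's Analyzer API.
abstract class SketchAnalyzer {
    // Core entry point: the whole chain consumes a Reader.
    abstract List<String> tokens(Reader input);

    // Convenience overload: wrap the String in a StringReader by default.
    // A subclass could override this to avoid the wrapping.
    List<String> tokens(String input) {
        return tokens(new StringReader(input));
    }
}

// Hypothetical char filter: a Reader wrapping another Reader, lowercasing as it reads.
class LowerCaseCharFilter extends FilterReader {
    LowerCaseCharFilter(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        return c == -1 ? -1 : Character.toLowerCase(c);
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        for (int i = off; i < off + n; i++) {
            buf[i] = Character.toLowerCase(buf[i]);
        }
        return n;
    }
}

// Splits on whitespace after running the input through the char filter.
class WhitespaceSketchAnalyzer extends SketchAnalyzer {
    @Override
    List<String> tokens(Reader input) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        try (Reader filtered = new LowerCaseCharFilter(input)) {
            int c;
            while ((c = filtered.read()) != -1) {
                if (Character.isWhitespace(c)) {
                    if (current.length() > 0) {
                        out.add(current.toString());
                        current.setLength(0);
                    }
                } else {
                    current.append((char) c);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (current.length() > 0) {
            out.add(current.toString());
        }
        return out;
    }
}
```

Calling `new WhitespaceSketchAnalyzer().tokens("Hello Lucene")` goes through the String overload, which hands a StringReader to the same Reader-based chain, so the char filter never needs to know the input started as a String.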

Don't care about QueryParser, because when you have indexed a numeric field, you should create a
Query on your own and not use this query parser, which corrupts everything because of its
syntax and whitespace handling.

But you added support for this to the flexible QP last summer as part of GSOC, right (LUCENE-1768)?
So it's possible with Lucene's syntax? I'm confused, I guess.

> does not reuse its inlined Keyword-TokenStream
> ---------------------------------------------------------
>                 Key: LUCENE-4317
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 4.0-BETA
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 5.0, 4.0
>         Attachments: LUCENE-4317.patch
> contains an inlined Keyword-TokenStream. Unfortunately this one is recreated
> all the time, although one reuses the same Field instance. For NumericTokenStream
> reuses it, but the Keyword one not.
> We should apply the same logic and lazy init the TokenStream with a setter for the String
> value and reset(). This would look identical to SetNumeric(xx).
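
The lazy-init-and-reuse pattern described in the issue can be sketched in plain Java as follows. This is not the attached patch and not Lucene's Field or TokenStream API: `KeywordStreamSketch` and `FieldSketch` are hypothetical classes that only illustrate creating the stream once and then resetting it with a new value.

```java
// Hypothetical single-token stream; NOT Lucene's TokenStream API.
final class KeywordStreamSketch {
    private String value;
    private boolean used = true; // nothing to emit until reset() is called

    // Setter for the String value, so one instance can serve many values.
    void setValue(String value) {
        this.value = value;
    }

    // Rewind the stream so the current value is emitted again.
    void reset() {
        used = false;
    }

    // Emits the whole value as a single token once, then null.
    String next() {
        if (used) {
            return null;
        }
        used = true;
        return value;
    }
}

// Hypothetical field holder; NOT Lucene's Field class.
final class FieldSketch {
    private String value;
    private KeywordStreamSketch stream; // created lazily, then reused

    FieldSketch(String value) {
        this.value = value;
    }

    void setStringValue(String value) {
        this.value = value;
    }

    // Returns the same stream instance on every call, re-primed with the current value.
    KeywordStreamSketch tokenStream() {
        if (stream == null) {
            stream = new KeywordStreamSketch();
        }
        stream.setValue(value);
        stream.reset();
        return stream;
    }
}
```

Reusing one `FieldSketch` across documents then allocates the stream only on the first `tokenStream()` call; later calls just set the new value and reset.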
