lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4317) Field.java does not reuse its inlined Keyword-TokenStream
Date Tue, 21 Aug 2012 12:12:38 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438639#comment-13438639 ]

Michael McCandless commented on LUCENE-4317:
--------------------------------------------

+1 to the current patch and separately debating how to handle the long-standing NOT_ANALYZED
trap.

I do think StringField serves an important purpose though (it sets DOCS_ONLY, turns off norms),
in addition to not analyzing.  But I don't like that it has its own little analyzer/tokenstream
inside: it would be much better to simply use KeywordAnalyzer.

Or (crazy idea): maybe we could simply call on the analyzer (like we do for normal tokenized
fields), but then insist that what was returned is in fact from KeywordAnalyzer?  This would
force users of StringField to use PerFieldAnalyzerWrapper (PFAW) with this field mapped to
KeywordAnalyzer.  It's rather... anal though.  And it will be confusing to users who "forget"
to use PFAW (but then this is a service to them: it points out that at query-time their
analysis is wrong).  Advanced users are free to use Field directly if somehow this checking
becomes a problem ...
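
For illustration only, the check described above could be sketched roughly as follows. This is a stand-alone model, not Lucene code and not part of the patch: the Analyzer interface, the two analyzer classes, PerFieldWrapper (a stand-in for PerFieldAnalyzerWrapper) and checkStringField are all hypothetical names made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-alone model of the "insist on KeywordAnalyzer" check; no Lucene classes are used. */
public class KeywordCheckSketch {

    /** Hypothetical marker interface standing in for an analyzer. */
    interface Analyzer {}

    /** Hypothetical stand-in for an analyzer that emits the whole value as one token. */
    static final class KeywordAnalyzer implements Analyzer {}

    /** Hypothetical stand-in for any analyzer that splits text into several tokens. */
    static final class TokenizingAnalyzer implements Analyzer {}

    /** Hypothetical stand-in for a per-field analyzer wrapper. */
    static final class PerFieldWrapper {
        private final Analyzer defaultAnalyzer;
        private final Map<String, Analyzer> perField = new HashMap<>();

        PerFieldWrapper(Analyzer defaultAnalyzer) {
            this.defaultAnalyzer = defaultAnalyzer;
        }

        PerFieldWrapper map(String field, Analyzer analyzer) {
            perField.put(field, analyzer);
            return this;
        }

        Analyzer analyzerFor(String field) {
            return perField.getOrDefault(field, defaultAnalyzer);
        }
    }

    /** Fails fast if the analyzer configured for a string field is not a KeywordAnalyzer. */
    static void checkStringField(String field, PerFieldWrapper analyzers) {
        Analyzer analyzer = analyzers.analyzerFor(field);
        if (!(analyzer instanceof KeywordAnalyzer)) {
            throw new IllegalStateException(
                "field \"" + field + "\" must be mapped to KeywordAnalyzer");
        }
    }

    public static void main(String[] args) {
        PerFieldWrapper analyzers =
            new PerFieldWrapper(new TokenizingAnalyzer()).map("id", new KeywordAnalyzer());

        checkStringField("id", analyzers); // passes: "id" is mapped to KeywordAnalyzer

        try {
            checkStringField("sku", analyzers); // "sku" falls back to the default analyzer
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The point of the model is the user-visible behavior: a field that was never mapped fails loudly at index time instead of silently being analyzed differently at query time.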
                
> Field.java does not reuse its inlined Keyword-TokenStream
> ---------------------------------------------------------
>
>                 Key: LUCENE-4317
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4317
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 4.0-BETA
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 5.0, 4.0
>
>         Attachments: LUCENE-4317.patch
>
>
> Field.java contains an inlined Keyword-TokenStream. Unfortunately this one is recreated
> all the time, even when one reuses the same Field instance. For NumericTokenStream, Field.java
> reuses it, but not the Keyword one.
> We should apply the same logic and lazily init the TokenStream, with a setter for the String
> value and reset(). This would look identical to SetNumeric(xx).
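
A minimal sketch of the lazy-init-and-reuse pattern the description asks for, written as a stand-alone model rather than actual Lucene code. SingleTokenStream, ReusableField and indexAll are hypothetical names chosen for this example; the real change is in the attached LUCENE-4317.patch and may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-alone model of a single-token stream that is created lazily and then reused. */
public class KeywordStreamReuseSketch {

    /** Hypothetical single-token stream: emits its value exactly once after each reset(). */
    static final class SingleTokenStream {
        private String value;
        private boolean used = true;

        void setValue(String value) {
            this.value = value;
        }

        void reset() {
            used = false;
        }

        /** Returns the token on the first call after reset(), then null. */
        String next() {
            if (used) {
                return null;
            }
            used = true;
            return value;
        }
    }

    /** Hypothetical field that caches its stream instead of allocating one per call. */
    static final class ReusableField {
        private String value;
        private SingleTokenStream stream; // created on first use, reused afterwards

        ReusableField(String value) {
            this.value = value;
        }

        void setStringValue(String value) {
            this.value = value;
        }

        SingleTokenStream tokenStream() {
            if (stream == null) {
                stream = new SingleTokenStream(); // lazy init
            }
            stream.setValue(value); // setter for the String value
            stream.reset();         // rewind so the token is emitted again
            return stream;
        }
    }

    /** Simulates indexing several documents with one reused field instance. */
    static List<String> indexAll(ReusableField field, String... values) {
        List<String> tokens = new ArrayList<>();
        for (String v : values) {
            field.setStringValue(v);
            SingleTokenStream ts = field.tokenStream();
            for (String t = ts.next(); t != null; t = ts.next()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        ReusableField field = new ReusableField("");
        System.out.println(indexAll(field, "a", "b", "c"));            // prints [a, b, c]
        System.out.println(field.tokenStream() == field.tokenStream()); // prints true
    }
}
```

The identity check in main shows the intended effect: the same stream object is handed back on every call, so reusing a field across documents allocates only one stream.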

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

