lucene-dev mailing list archives

From "Tim Allison (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (LUCENE-5470) Refactoring multiterm analysis
Date Mon, 24 Feb 2014 19:35:19 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-5470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910711#comment-13910711 ]

Tim Allison edited comment on LUCENE-5470 at 2/24/14 7:33 PM:
--------------------------------------------------------------

First version of patch.  

I used the code from AnalyzingQueryParser as the template.

If we want to do this consolidation, some questions:

1) should analyzeMultitermTerm be static?
2) is the unchecked IllegalArgumentException the way to go if zero or more than one token is generated by the analyzer?
3) should we try to coalesce around one name: analyzeMultiTerm vs analyzeMultitermTerm?
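
To make questions 1 and 2 concrete, here is a minimal sketch of the behavior under discussion. It is not the attached patch and does not use Lucene's TokenStream API: TokenSource, WhitespaceLowercaseSource and MultitermSketch are hypothetical stand-ins invented for illustration. The point it shows is that the stream is drained and closed before the token count is checked, so nothing is left unconsumed when the analyzer produces more than one token.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for an analyzer's token stream; NOT Lucene's TokenStream API.
interface TokenSource extends AutoCloseable {
    boolean incrementToken();   // advance to the next token; false when exhausted
    String term();              // text of the current token
    @Override void close();     // release resources (no checked exception)
}

// Toy source: lowercases and splits on whitespace, and records whether it was fully read.
final class WhitespaceLowercaseSource implements TokenSource {
    private final Iterator<String> tokens;
    private String current;
    boolean exhausted = false;
    boolean closed = false;

    WhitespaceLowercaseSource(String text) {
        String trimmed = text.trim();
        List<String> parts = trimmed.isEmpty()
                ? Collections.<String>emptyList()
                : Arrays.asList(trimmed.toLowerCase(Locale.ROOT).split("\\s+"));
        this.tokens = parts.iterator();
    }

    @Override public boolean incrementToken() {
        if (tokens.hasNext()) {
            current = tokens.next();
            return true;
        }
        exhausted = true;
        return false;
    }

    @Override public String term() { return current; }

    @Override public void close() { closed = true; }
}

final class MultitermSketch {
    private MultitermSketch() {}

    // Static (question 1) and throwing an unchecked exception (question 2).
    static String analyzeMultitermTerm(TokenSource source, String part) {
        String first = null;
        int count = 0;
        try (TokenSource s = source) {
            // Drain the whole stream before deciding, so it is never left half-read.
            while (s.incrementToken()) {
                if (count == 0) {
                    first = s.term();
                }
                count++;
            }
        }
        if (count != 1) {
            throw new IllegalArgumentException(
                "expected exactly one token for \"" + part + "\" but got " + count);
        }
        return first;
    }
}
```

With this stand-in, analyzeMultitermTerm(new WhitespaceLowercaseSource("Foo"), "Foo") returns "foo", while an input such as "foo bar" throws IllegalArgumentException only after the source has been fully read and closed.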

Thank you.


was (Author: tallison@mitre.org):
First version of patch.

If we want to do this consolidation, some questions:

1) should analyzeMultitermTerm be static?
2) is the unchecked IllegalArgumentException the way to go if zero or more than one token is generated by the analyzer?
3) should we try to coalesce around one name: analyzeMultiTerm vs analyzeMultitermTerm?

Thank you.

> Refactoring multiterm analysis
> ------------------------------
>
>                 Key: LUCENE-5470
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5470
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/queryparser
>    Affects Versions: 5.0
>            Reporter: Tim Allison
>            Priority: Minor
>         Attachments: LUCENE-5470.patch
>
>
> There are currently three methods to analyze multiterms in Lucene and Solr:
> 1) QueryParserBase
> 2) AnalyzingQueryParser
> 3) TextField (Solr)
> The code in QueryParserBase and in TextField does not consume the tokenstream if more than one token is generated by the analyzer.  (Admittedly, thanks to the magic of MultitermAwareComponents in Solr, this type of exception probably never happens and the unconsumed stream problem is probably non-existent in Solr.)
> I propose consolidating the multiterm analysis code into one place: QueryBuilder in Lucene core.
> This is part of a refactoring that will also help reduce duplication of code with LUCENE-5205.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


