lucene-dev mailing list archives

From "Benson Margulies (JIRA)" <j...@apache.org>
Subject [jira] [Created] (LUCENE-5386) Make Tokenizers deliver their final offsets
Date Mon, 06 Jan 2014 20:18:50 GMT
Benson Margulies created LUCENE-5386:
----------------------------------------

             Summary: Make Tokenizers deliver their final offsets
                 Key: LUCENE-5386
                 URL: https://issues.apache.org/jira/browse/LUCENE-5386
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Benson Margulies


Tokenizers _must_ implement end() so that it sets the final offset. Currently, nothing
enforces this. end() has a useful implementation in TokenStream, so simply making it
abstract is not attractive.

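Proposal: add

    abstract int finalOffset();

to Tokenizer, and then have Tokenizer implement

    @Override
    public void end() throws IOException {
        super.end();
        // Start and end of the final offset both point at the end of the input.
        int fo = finalOffset();
        offsetAttr.setOffset(fo, fo);
    }

or something to that effect.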
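
To show how this would force subclasses to supply the final offset, here is a minimal
self-contained sketch of the same pattern. It does not use the Lucene classes: the names
SketchTokenStream, SketchTokenizer and CharCountingTokenizer are hypothetical stand-ins,
the offsets are plain int fields rather than an OffsetAttribute, and the checked
IOException that the real end() declares is left out for brevity.

```java
/** Hypothetical stand-in for TokenStream; it only models the offset state. */
abstract class SketchTokenStream {
    protected int startOffset;
    protected int endOffset;

    /** Stands in for the useful default end() that should stay non-abstract. */
    public void end() {
        // The real base class does its own end-of-stream bookkeeping here.
    }
}

/** Hypothetical stand-in for Tokenizer with the proposed abstract hook. */
abstract class SketchTokenizer extends SketchTokenStream {
    /** Every concrete tokenizer must say where its input ended, or it will not compile. */
    abstract int finalOffset();

    @Override
    public void end() {
        super.end();
        // Start and end of the final offset both point at the end of the input.
        int fo = finalOffset();
        startOffset = fo;
        endOffset = fo;
    }
}

/** Toy subclass (an assumption for illustration): the final offset is the input length. */
final class CharCountingTokenizer extends SketchTokenizer {
    private final String input;

    CharCountingTokenizer(String input) {
        this.input = input;
    }

    @Override
    int finalOffset() {
        return input.length();
    }
}
```

With this shape, a subclass that forgets finalOffset() fails at compile time, while the
base end() logic is still inherited unchanged. A subclass could still override end()
without calling super.end(), so this only covers the common case.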

Other alternatives can be considered depending on how this looks.





