lucene-dev mailing list archives

From eks dev <>
Subject new TokenStream api Question
Date Sun, 26 Apr 2009 20:38:34 GMT

I am looking into the new TermAttribute usage and wonder what would be the best way to implement
a PrefixFilter that filters out terms starting with a given prefix.

Something like this, where '-' represents my prefix:

  public final boolean incrementToken() throws IOException {
    // advance to the first token we want to keep
    while (input.incrementToken()) {
      int len = termAtt.termLength();
      // keep only terms with length > 0 that do not start with '-'
      if (len > 0 && termAtt.termBuffer()[0] != '-') {
        return true;
      }
      // note: else we ignore it
    }
    // reached EOS
    return false;
  }


The question would be:

Can I extend TermAttribute and add boolean startsWith(char c)?
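
A rough sketch of the idea, written as a standalone class so it does not depend on Lucene at all. The class name MyTermAttribute, its fields, and setTermBuffer(String) are hypothetical stand-ins for illustration only; whether an attribute like this can actually be plugged into the indexing chain is exactly the open question.

```java
// Standalone sketch, NOT Lucene code: MyTermAttribute and all of its
// members are hypothetical and only illustrate the proposed startsWith(char).
final class MyTermAttribute {
  private char[] termBuffer = new char[16];
  private int termLength = 0;

  // Copy the term text into the internal buffer, growing it if needed.
  void setTermBuffer(String term) {
    if (term.length() > termBuffer.length) {
      termBuffer = new char[term.length()];
    }
    term.getChars(0, term.length(), termBuffer, 0);
    termLength = term.length();
  }

  // Plain field read, with no lazy buffer initialization.
  int termLength() {
    return termLength;
  }

  // True only if the term is non-empty and its first char equals c.
  boolean startsWith(char c) {
    return termLength > 0 && termBuffer[0] == c;
  }
}
```

With something like this, the test inside the loop would shrink to termAtt.termLength() > 0 && !termAtt.startsWith('-'), without touching the buffer directly.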

The point is speed, and my code gets smaller.
TermAttribute calls one method in both termLength() and termBuffer() that I do not understand
(backward compatibility, I guess):
  public int termLength() {
    initTermBuffer(); // I'd like to avoid it...
    return termLength;
  }

I'd like to get rid of initTermBuffer(). The first option is to *extend* the TermAttribute code
(but the fields are private, so no help there); the other is to implement my own MyTermAttribute
(will the indexer know how to deal with it?).

Must I extend TermAttribute, or can I add my own?


