lucene-dev mailing list archives

From eks dev <eks...@yahoo.co.uk>
Subject Re: new TokenStream api Question
Date Sun, 26 Apr 2009 22:18:07 GMT

Regardless of that, I really do not understand the call to initTermBuffer() in termLength().
What is it good for?

The method returns the same value, zero, in both cases (with or without that call), so I see no harm in removing it.

  /** Return number of valid characters (length of the term)
   *  in the termBuffer array. */
  public int termLength() {
    initTermBuffer();
    return termLength;
  }
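
To make the point concrete, here is a minimal stand-alone sketch. It is not the Lucene source: SketchTermAttribute, MIN_BUFFER_SIZE and the two termLength variants are hypothetical names, and it assumes that initTermBuffer() does nothing more than lazily allocate the buffer and set termLength to 0. Under that assumption both variants return the same value, and the only visible difference is whether the buffer gets allocated as a side effect.

```java
class LazyTermLengthSketch {

    /** Hypothetical stand-in for TermAttribute; not the Lucene class. */
    static final class SketchTermAttribute {
        private static final int MIN_BUFFER_SIZE = 10; // assumed value

        private char[] termBuffer; // allocated lazily
        private int termLength;    // int fields default to 0 in Java

        /** Assumed behaviour: allocate the buffer on first use only. */
        private void initTermBuffer() {
            if (termBuffer == null) {
                termBuffer = new char[MIN_BUFFER_SIZE];
                termLength = 0;
            }
        }

        /** Variant that mirrors the method quoted above. */
        int termLengthWithInit() {
            initTermBuffer();
            return termLength;
        }

        /** Variant with the call removed. */
        int termLengthWithoutInit() {
            return termLength;
        }

        boolean isBufferAllocated() {
            return termBuffer != null;
        }

        /** Copies the given characters into the buffer, growing it if needed. */
        void setTermBuffer(char[] src, int offset, int length) {
            if (termBuffer == null || termBuffer.length < length) {
                termBuffer = new char[Math.max(length, MIN_BUFFER_SIZE)];
            }
            System.arraycopy(src, offset, termBuffer, 0, length);
            termLength = length;
        }
    }

    public static void main(String[] args) {
        SketchTermAttribute fresh = new SketchTermAttribute();
        System.out.println("without init: " + fresh.termLengthWithoutInit());
        System.out.println("buffer allocated: " + fresh.isBufferAllocated());
        System.out.println("with init: " + fresh.termLengthWithInit());
        System.out.println("buffer allocated: " + fresh.isBufferAllocated());

        SketchTermAttribute filled = new SketchTermAttribute();
        filled.setTermBuffer("-foo".toCharArray(), 0, 4);
        System.out.println("both agree after set: "
                + (filled.termLengthWithInit() == filled.termLengthWithoutInit()));
    }
}
```

If the real initTermBuffer() does more than this (for example, converting some other representation of the term into the char buffer), then the call would matter and this sketch would not apply.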



----- Original Message ----
> From: Uwe Schindler <uwe@thetaphi.de>
> To: java-dev@lucene.apache.org
> Sent: Sunday, 26 April, 2009 23:03:06
> Subject: RE: new TokenStream api Question
> 
> There is one problem: if you extend TermAttribute, the class is different
> (which is the key in the attributes list). So when you initialize the
> TokenStream and do a
> 
> YourClass termAtt = (YourClass) addAttribute(YourClass.class)
> 
> ...you create a new attribute. So one possibility would be to also specify
> the instance and save the attribute by class (as key), but with your
> instance. If you are the first one that creates the attribute (if it is a
> token stream and not a filter it is ok, you will be the first, as it adds the
> attribute in the ctor), everything is ok. Register the attribute yourself
> (maybe we should add a specialized addAttribute that can specify an instance
> as default?):
> 
> YourClass termAtt = new YourClass();
> attributes.put(TermAttribute.class, termAtt);
> 
> In this case, for the indexer it is a standard TermAttribute, but you can
> do more with it.
> 
> Replacing TermAttribute with your own class is not possible, as the indexer will
> get a ClassCastException when using the instance retrieved with
> getAttribute(TermAttribute.class).
> 
> Uwe
> 
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
> 
> > -----Original Message-----
> > From: eks dev [mailto:eksdev@yahoo.co.uk]
> > Sent: Sunday, April 26, 2009 10:39 PM
> > To: java-dev@lucene.apache.org
> > Subject: new TokenStream api Question
> > 
> > 
> > I am just looking into new TermAttribute usage and wonder what would be
> > the best way to implement PrefixFilter that would filter out some Terms
> > that have some prefix,
> > 
> > something like this, where '-' represents my prefix:
> > 
> >   public final boolean incrementToken() throws IOException {
> >     // the first word we found
> >     while (input.incrementToken()) {
> >       int len = termAtt.termLength();
> > 
> >       if(len > 0 && termAtt.termBuffer()[0]!='-') //only length > 0 and non LFs
> >         return true;
> >       // note: else we ignore it
> >     }
> >     // reached EOS
> >     return false;
> >   }
> > 
> > 
> > 
> > 
> > 
> > The question would be:
> > 
> > can I extend TermAttribute and add boolean startsWith(char c);
> > 
> > The point is speed and my code gets smaller.
> > TermAttribute has one method, called in termLength() and termBuffer(), that I do
> > not understand (back compatibility, I guess):
> >   public int termLength() {
> >     initTermBuffer(); // I'd like to avoid it...
> >     return termLength;
> >   }
> > 
> > 
> > I'd like to get rid of initTermBuffer(), the first option is to *extend*
> > TermAttribute code (but fields are private, so no help there) or can I
> > implement my own MyTermAttribute (will Indexer know how to deal with it?)
> > 
> > Must I extend TermAttribute, or can I add my own?
> > 
> > thanks,
> > eks
> > 
> > 
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-dev-help@lucene.apache.org
> 
> 
> 
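
The registration idea from Uwe's reply (store a subclass instance under the base class key) can be sketched without Lucene at all. Everything below is hypothetical and only mimics the idea of an attribute map keyed by class: AttributeRegistrySketch, TermAttr, PrefixTermAttr and UnrelatedTermAttr are made-up names, not Lucene's API. The sketch shows that a consumer looking up the base class still gets a usable instance when a subclass is registered, and that registering an unrelated class under the same key leads to a ClassCastException on lookup.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AttributeRegistrySketch {

    /** Hypothetical base attribute, standing in for TermAttribute. */
    static class TermAttr {
        private char[] buffer = new char[0];
        private int length;

        void setTerm(String term) {
            buffer = term.toCharArray();
            length = buffer.length;
        }

        int termLength() { return length; }

        char[] termBuffer() { return buffer; }
    }

    /** Hypothetical subclass that adds the startsWith helper asked about. */
    static class PrefixTermAttr extends TermAttr {
        boolean startsWith(char c) {
            return termLength() > 0 && termBuffer()[0] == c;
        }
    }

    /** Hypothetical replacement that does NOT extend TermAttr. */
    static class UnrelatedTermAttr { }

    // Attributes are stored by class (the key), as described in the reply.
    private final Map<Class<?>, Object> attributes = new LinkedHashMap<>();

    void put(Class<?> key, Object instance) {
        attributes.put(key, instance);
    }

    /** What a consumer such as the indexer would do: look up by base class. */
    TermAttr getTermAttribute() {
        return (TermAttr) attributes.get(TermAttr.class);
    }

    public static void main(String[] args) {
        AttributeRegistrySketch registry = new AttributeRegistrySketch();

        // Register the subclass instance under the base class key.
        PrefixTermAttr mine = new PrefixTermAttr();
        registry.put(TermAttr.class, mine);
        mine.setTerm("-foo");

        // The consumer sees a plain TermAttr; the producer keeps the richer type.
        System.out.println("consumer length: " + registry.getTermAttribute().termLength());
        System.out.println("starts with '-': " + mine.startsWith('-'));

        // Replacing the attribute with an unrelated class breaks the consumer.
        registry.put(TermAttr.class, new UnrelatedTermAttr());
        try {
            registry.getTermAttribute();
        } catch (ClassCastException e) {
            System.out.println("consumer failed with ClassCastException");
        }
    }
}
```

The design point is that the key stays the base class while the stored value is the subclass instance, so the extra method is available only to code that holds the typed reference.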



      


