lucene-dev mailing list archives

From "Vanlerberghe, Luc" <Luc.Vanlerber...@bvdep.com>
Subject RE: KeywordAnalyzer split into KeywordTokenizer/KeywordAnalyzer
Date Fri, 11 Feb 2005 15:10:24 GMT
This one works fine.
My version had a smaller default buffer size and an extra constructor to
choose the size if wanted:

public class KeywordTokenizer extends Tokenizer {
  public KeywordTokenizer(Reader input) {
    this(input, DEFAULT_BUFFER_SIZE);
  }

  public KeywordTokenizer(Reader input, int bufferSize) {
    super(input);
    this.buffer = new char[bufferSize];
    this.done = false;
  }

  private static final int DEFAULT_BUFFER_SIZE = 256;
  private final char[] buffer;

  private boolean done;

... etc
}
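
For readers without the diff at hand, here is a stand-alone sketch of the idea behind the snippet above: the whole input is read into one buffer and handed back as a single token. It is only an illustration and does not depend on Lucene. The class name KeywordTokenizerSketch, the String-returning next() method, and the buffer-doubling strategy are all assumptions, since the rest of the real class is elided above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Arrays;

/**
 * Stand-alone illustration (hypothetical, no Lucene dependency):
 * returns the entire input as one token, then signals end of stream.
 */
public class KeywordTokenizerSketch {
  private static final int DEFAULT_BUFFER_SIZE = 256;

  private final Reader input;
  private char[] buffer;   // not final here: this sketch grows it on demand
  private boolean done = false;

  public KeywordTokenizerSketch(Reader input) {
    this(input, DEFAULT_BUFFER_SIZE);
  }

  public KeywordTokenizerSketch(Reader input, int bufferSize) {
    if (bufferSize <= 0) {
      throw new IllegalArgumentException("bufferSize must be positive");
    }
    this.input = input;
    this.buffer = new char[bufferSize];
  }

  /** Returns the whole input on the first call and null on later calls. */
  public String next() throws IOException {
    if (done) {
      return null;
    }
    done = true;
    int length = 0;
    while (true) {
      int read = input.read(buffer, length, buffer.length - length);
      if (read == -1) {
        break;                       // end of input reached
      }
      length += read;
      if (length == buffer.length) { // buffer full: double it (assumed strategy)
        buffer = Arrays.copyOf(buffer, buffer.length * 2);
      }
    }
    return new String(buffer, 0, length);
  }

  /** Closes the underlying Reader. */
  public void close() throws IOException {
    input.close();
  }

  public static void main(String[] args) throws IOException {
    // A deliberately tiny buffer to exercise the growth path.
    KeywordTokenizerSketch t =
        new KeywordTokenizerSketch(new StringReader("New York City"), 4);
    System.out.println(t.next());
    System.out.println(t.next());
    t.close();
  }
}
```

A smaller default mostly matters when many short keyword fields are tokenized; the second constructor lets callers pick a larger size when they expect long values.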

Luc



-----Original Message-----
From: Erik Hatcher [mailto:erik@ehatchersolutions.com] 
Sent: vrijdag 11 februari 2005 14:53
To: Lucene Developers List
Subject: Re: KeywordAnalyzer split into KeywordTokenizer/KeywordAnalyzer

Luc,

Good point about the Reader close issue - I should have subclassed
Tokenizer instead of TokenStream.  Oops!
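
To make the close issue concrete, here is a rough sketch with stand-in classes. These are not the Lucene sources: BaseStream, ReaderBackedStream, TrackingReader and the two keyword stream classes are hypothetical names that only mirror the behaviour described in this thread, namely a base class whose close() is empty and a Reader-holding subclass that closes its input.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class CloseSketch {
  /** Stand-in for a base stream whose close() does nothing. */
  static abstract class BaseStream {
    public void close() throws IOException {
      // intentionally empty
    }
  }

  /** Stand-in for a Reader-backed stream that closes its input. */
  static abstract class ReaderBackedStream extends BaseStream {
    protected final Reader input;

    ReaderBackedStream(Reader input) {
      this.input = input;
    }

    @Override
    public void close() throws IOException {
      input.close();
    }
  }

  /** Reader that records whether close() was called on it. */
  static class TrackingReader extends StringReader {
    boolean closed = false;

    TrackingReader(String s) {
      super(s);
    }

    @Override
    public void close() {
      closed = true;
      super.close();
    }
  }

  /** Extends the base stream directly: inherits the empty close(). */
  static class LeakyKeywordStream extends BaseStream {
    private final Reader input;

    LeakyKeywordStream(Reader input) {
      this.input = input;
    }
  }

  /** Extends the Reader-backed stream: close() reaches the Reader. */
  static class ClosingKeywordStream extends ReaderBackedStream {
    ClosingKeywordStream(Reader input) {
      super(input);
    }
  }

  public static void main(String[] args) throws IOException {
    TrackingReader a = new TrackingReader("x");
    new LeakyKeywordStream(a).close();

    TrackingReader b = new TrackingReader("x");
    new ClosingKeywordStream(b).close();

    System.out.println("leaky closed reader: " + a.closed);
    System.out.println("closing closed reader: " + b.closed);
  }
}
```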

I just made the split and committed.  Any issues with this one?

	Erik


On Feb 11, 2005, at 7:25 AM, Vanlerberghe, Luc wrote:

> Hi all,
>
> I found Erik's KeywordAnalyzer very useful (I had just written a
> similar but more limited one a few hours before him), but I wanted a
> KeywordTokenizer that I could use more easily in different
> circumstances (e.g. chained to a LowerCaseFilter).
>
> So I took the liberty of modifying his code into a KeywordTokenizer and
> letting the KeywordAnalyzer return an instance of it.
> It also solves the problem that the original KeywordAnalyzer never
> closed its Reader (TokenStream.close() was called implicitly, but that
> has an empty implementation).
>
> What is the proper way to submit this?
> I attached a diff that should be applied in
> contrib/analyzers/src/java/org/apache/lucene/analysis
> Should I submit it as an attachment to a Bugzilla report instead?
>
> Luc
>
>
>
>  <<KeywordAnalyzer.diff>>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org

