lucene-dev mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: KeywordAnalyzer split into KeywordTokenizer/KeywordAnalyzer
Date Fri, 11 Feb 2005 15:31:12 GMT
Luc - I've added the buffer size constructor and committed.

I don't think, now that it subclasses Tokenizer, that we need the 
close test that you sent.  But if you feel differently, let me know.

	Erik
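
To illustrate the close point, here is a minimal sketch of the difference
between the two base classes. It assumes only what the thread states:
TokenStream.close() is empty, while a Tokenizer closes the Reader it was
given. The Sketch* classes, TrackingReader and the two *Keyword classes are
hypothetical stand-ins written for this example, not the Lucene sources.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in for TokenStream, reduced to close() only.
abstract class SketchTokenStream {
    // Empty by default, as the thread describes for TokenStream.close().
    public void close() throws IOException {
    }
}

// Hypothetical stand-in for Tokenizer: it owns the Reader and closes it.
abstract class SketchTokenizer extends SketchTokenStream {
    protected final Reader input;

    protected SketchTokenizer(Reader input) {
        this.input = input;
    }

    @Override
    public void close() throws IOException {
        input.close();
    }
}

// Reader that records whether close() ever reached it.
class TrackingReader extends StringReader {
    boolean closed = false;

    TrackingReader(String s) {
        super(s);
    }

    @Override
    public void close() {
        closed = true;
        super.close();
    }
}

// Variant built directly on the stream class: inherits the empty close().
class StreamBasedKeyword extends SketchTokenStream {
    private final Reader input;

    StreamBasedKeyword(Reader input) {
        this.input = input;
    }
}

// Variant built on the tokenizer class: inherits the Reader-closing close().
class TokenizerBasedKeyword extends SketchTokenizer {
    TokenizerBasedKeyword(Reader input) {
        super(input);
    }
}

class CloseDemo {
    public static void main(String[] args) throws IOException {
        TrackingReader a = new TrackingReader("abc");
        new StreamBasedKeyword(a).close();

        TrackingReader b = new TrackingReader("abc");
        new TokenizerBasedKeyword(b).close();

        System.out.println(a.closed + " " + b.closed); // prints "false true"
    }
}
```

With the tokenizer-based variant the Reader is closed by inherited code, so
a dedicated close test would mostly be exercising the base class.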

On Feb 11, 2005, at 10:10 AM, Vanlerberghe, Luc wrote:

> This one works fine.
> My version had a smaller default buffer size and an extra constructor
> to choose the size if wanted:
>
> public class KeywordTokenizer extends Tokenizer {
>   public KeywordTokenizer(Reader input) {
>     this(input, DEFAULT_BUFFER_SIZE);
>   }
>
>   public KeywordTokenizer(Reader input, int bufferSize) {
>     super(input);
>     this.buffer=new char[bufferSize];
>     this.done=false;
>   }
>
>   private static final int DEFAULT_BUFFER_SIZE=256;
>   private final char[] buffer;
>
>   private boolean done;
>
> ... etc
> }
>
> Luc
>
>
>
> -----Original Message-----
> From: Erik Hatcher [mailto:erik@ehatchersolutions.com]
> Sent: vrijdag 11 februari 2005 14:53
> To: Lucene Developers List
> Subject: Re: KeywordAnalyzer split into KeywordTokenizer/KeywordAnalyzer
>
> Luc,
>
> Good point about the Reader close issue - I should have subclassed
> Tokenizer instead of TokenStream.  Oops!
>
> I just made the split and committed.  Any issues with this one?
>
> 	Erik
>
>
> On Feb 11, 2005, at 7:25 AM, Vanlerberghe, Luc wrote:
>
>> Hi all,
>>
>> I found Erik's KeywordAnalyzer very useful (I had just written a
>> similar but more limited one a few hours before him) but I wanted a
>> KeywordTokenizer that I would then be able to use in different
>> circumstances more easily (e.g. chain it to a LowerCaseFilter).
>>
>> So I took the liberty of modifying his code into a KeywordTokenizer and
>> let the KeywordAnalyzer return an instance of it.
>> It also solves the problem that the original KeywordAnalyzer never
>> closed its Reader (TokenStream.close() was called implicitly, but that
>> has an empty implementation).
>>
>> What is the proper way to submit this?
>>  I attached a diff that should be applied in
>> contrib/analyzers/src/java/org/apache/lucene/analysis
>> Should I submit it as an attachment to a Bugzilla report instead?
>>
>> Luc
>>
>>
>>
>>  <<KeywordAnalyzer.diff>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
>> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
>
>
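
For readers following the thread, the sketch below shows one way the pieces
discussed above could fit together: a tokenizer that returns the entire
Reader content as a single term, with the configurable read-buffer size from
Luc's constructor, chained to a lower-casing filter. This is an illustration
under stated assumptions, not the committed code. TermSource,
KeywordTokenizerSketch and LowerCaseSketch are hypothetical names, and plain
strings stand in for Lucene's Token objects. The body of next() is a guess at
what the elided part of the quoted class might do.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Locale;

// Hypothetical minimal stream interface; next() returns null at end of stream.
interface TermSource {
    String next() throws IOException;
}

// Sketch of a keyword tokenizer: the whole input becomes exactly one term.
class KeywordTokenizerSketch implements TermSource {
    private static final int DEFAULT_BUFFER_SIZE = 256;

    private final Reader input;
    private final char[] buffer; // read chunk; its size does not limit the term length
    private boolean done = false;

    KeywordTokenizerSketch(Reader input) {
        this(input, DEFAULT_BUFFER_SIZE);
    }

    KeywordTokenizerSketch(Reader input, int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        this.input = input;
        this.buffer = new char[bufferSize];
    }

    @Override
    public String next() throws IOException {
        if (done) {
            return null; // the single term has already been returned
        }
        done = true;
        StringBuilder text = new StringBuilder();
        int read;
        // Read chunk by chunk until the Reader reports end of input.
        while ((read = input.read(buffer)) != -1) {
            text.append(buffer, 0, read);
        }
        return text.toString();
    }
}

// Sketch of a filter that lower-cases whatever the wrapped source produces.
class LowerCaseSketch implements TermSource {
    private final TermSource source;

    LowerCaseSketch(TermSource source) {
        this.source = source;
    }

    @Override
    public String next() throws IOException {
        String term = source.next();
        return term == null ? null : term.toLowerCase(Locale.ROOT);
    }
}

class KeywordChainDemo {
    public static void main(String[] args) throws IOException {
        // A deliberately tiny buffer (4 chars) forces several read passes.
        TermSource chain = new LowerCaseSketch(
                new KeywordTokenizerSketch(new StringReader("Part-No 42/B"), 4));
        System.out.println(chain.next()); // prints "part-no 42/b"
        System.out.println(chain.next()); // prints "null"
    }
}
```

In this sketch the buffer size only affects how many read calls are made,
not the result, which is why a smaller default is harmless for short
keyword fields.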



