lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bill Taylor <>
Subject Re: Installing a custom tokenizer
Date Wed, 30 Aug 2006 01:32:11 GMT
Because I wanted to use the javaCC input code from Lucene.  99.99% of 
what the standard parser did was VERY GOOD.  having worked with 
computer-generated compilers in the past, I realized that if I were to 
modify the parser itself, I would eventually get into real trouble.  So 
I took the time to figure out how to use JavaCC, with a great deal of 
help from Mr. Miller.  Now I can make changes much more easily.

That is why I took the trouble to do it that way.

On Aug 29, 2006, at 8:12 PM, yueyu lin wrote:

> Your problem is that StandardTokenizer doesn's fit your requirements.
> Since you know how to implement a new one, just do it.
> If you just want to modify StandardTokenizer, you can get the codes and
> rename it to your class, then modify something that you dislike. I 
> think
> it's a so simple stuff, why do you make it so complicated?
> On 8/29/06, Bill Taylor <> wrote:
>> On Aug 29, 2006, at 7:12 PM, Mark Miller wrote:
>> >
>> > 2. The ParseException that is generated when making the
>> > StandardAnalyzer must be killed because there is another
>> > ParseException class (maybe in queryparser?) that must be used
>> > instead. The lucene build file excludes the StandardAnalyzer
>> > ParseException so that the other one is used. You could prob just
>> > delete it as well but then of course you would have to remember to
>> > delete it every time you rebuilt the javacc file.
>> >
>> If I use the generated parse exception, I get the message that the
>> ParseException which is thrown by next() is incompatible with the
>> throws statement in the generated code.  If I get rid of it and use 
>> the
>> older one, it turns out that generateException cannot generate a
>> ParseException because it passes a generated Token instead of the
>> Lucene Token.
>> It appears that the JavaCC I am using is generating code which is a 
>> bit
>> ahead of Lucene.
>> So I simply generated a new token of hte roper Lucene type for the
>> error message and everything compiled.  Now I have to tell the query
>> parser to use the same analyzer and all will be well.
>> Thanks.
>> The good news is that Lucene is unbelievably wonderful.  The bad news
>> is that Lucene is unbelievably wonderful.
>> Thanks again.
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail:
> -- 
> --
> Yueyu Lin

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message