lucene-java-user mailing list archives

From Steven A Rowe <>
Subject RE: StandardTokenizer generation from JFlex grammar
Date Thu, 04 Oct 2012 23:56:28 GMT
Hi Phani,

Assuming you're using Lucene 3.6.X, see:




I've pasted the relevant contents below:

WARNING: if you change StandardTokenizerImpl*.jflex or UAX29URLEmailTokenizer
and need to regenerate the tokenizer, only use the trunk version
of JFlex 1.5 (with a minimum SVN revision 597) at the moment!
Please install the JFlex 1.5 version (currently not released)
from its SVN repository:

 svn co jflex
 cd jflex
 mvn install

Then, create a file either in your home
directory, or within the Lucene directory and set the jflex.home
property to the path where the JFlex trunk checkout is located
(in the above example it's the directory called "jflex").
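
As a rough sketch of that last step (not the official procedure): the snippet below appends a jflex.home entry to a properties file. The file name build.properties and both default paths are assumptions on my part, since the text above does not name the file; adjust JFLEX_HOME and PROPS_FILE to your own setup.

```shell
# Assumption: JFLEX_HOME points at the directory created by the JFlex checkout above.
JFLEX_HOME="${JFLEX_HOME:-$PWD/jflex}"

# Assumption: the properties file name is hypothetical (build.properties is a guess);
# it may live in your home directory or in the Lucene directory.
PROPS_FILE="${PROPS_FILE:-$PWD/build.properties}"

# Append the jflex.home property so the build can locate the JFlex checkout.
printf 'jflex.home=%s\n' "$JFLEX_HOME" >> "$PROPS_FILE"

# Print the entry back as a quick sanity check.
grep '^jflex\.home=' "$PROPS_FILE"
```

Run it from the Lucene directory, or set PROPS_FILE to a file in your home directory instead.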


-----Original Message-----
From: vempap [] 
Sent: Thursday, October 04, 2012 7:43 PM
Subject: StandardTokenizer generation from JFlex grammar


  I'm trying to generate the standard tokenizer again using the jflex
specification (StandardTokenizerImpl.jflex) but I'm not able to do so due to
some errors (I would like to create my own jflex file using the standard
tokenizer, which is why I'm trying to first generate using that to get the hang
of things).

I'm using jflex 1.4.3 and I ran into the following error:

Error in file "<filename>" (line 64): 
Syntax error.
HangulEx       = (!(!\p{Script:Hangul}|!\p{WB:ALetter})) ({Format} |

Also, I tried installing an Eclipse plugin that I thought would provide
options similar to JavaCC's (through which we can generate classes within
Eclipse), but had no luck.

Any help would be appreciated.

