lucene-dev mailing list archives

From "Stanislaw Osinski (JIRA)" <>
Subject [jira] Commented: (LUCENE-966) A faster JFlex-based replacement for StandardAnalyzer
Date Thu, 02 Aug 2007 08:44:52 GMT


Stanislaw Osinski commented on LUCENE-966:

Thanks for more test cases. I guess the biggest problem here is that the scanner generated
by JavaCC doesn't seem to strictly follow the specification (see,
so I'd need to emulate possible JavaCC "bugs" I'm not aware of at the moment (I'm not an expert
on lexical scanner generation either, not yet at least :). I can add some workarounds to the
grammar to make the known incompatibility examples work, but this won't guarantee consistency
in general.

As a side note, it's a shame there's no trace of the version of JavaCC that was used to generate
the scanner for the original StandardAnalyzer. I'm also curious if the results of the current
JavaCC grammar would be the same with the newest version of the generator (4.0 I guess) --
I'll try to check that.

Anyway, I'll take another, more in-depth look at the problem. In the worst-case scenario,
we can keep the StandardAnalyzer as it was and add the new one next to it so that people
have a choice (on the other hand, this might be a problem for the quality tests).

> A faster JFlex-based replacement for StandardAnalyzer
> -----------------------------------------------------
>                 Key: LUCENE-966
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>            Reporter: Stanislaw Osinski
>             Fix For: 2.3
>         Attachments:, jflex-analyzer-patch.txt, jflex-analyzer-r560135-patch.txt, jflex-analyzer-r561292-patch.txt, jflex-analyzer-r561693-compatibility.txt
> JFlex ( can be used to generate a faster (up to several times) replacement for StandardAnalyzer. Will add a patch and a simple benchmark code in a while.
