lucene-dev mailing list archives

From Robert Muir <>
Subject Re: SynonymFilter, FST, and Aho-Corasick algorithm
Date Thu, 12 Jul 2012 17:12:54 GMT
On Thu, Jul 12, 2012 at 12:19 PM, Michael McCandless
<> wrote:
> On Thu, Jul 12, 2012 at 12:10 PM, Dawid Weiss <> wrote:
>> The development was too fast for me to keep up. And by the time I had some
>> concept of the API, Mike had written about a million lines of code that
>> would have to be rewritten ;)
> Mike is very happy to help rewrite that code for a better FST API :)
> We can and should also make incremental improvements.
> I do agree it's horrible to have code that only a small set of people
> understand: such code is effectively dead.

I agree, and think we should make improvements to the API whenever we can.

But a fast, efficient, and self-documenting FST API is probably going
to be elusive.

I think a lot of this could be fixed with examples and docs, which
we've been working at too, e.g.:

The biggest problem I have with documentation here is when it becomes
out-of-date. We've made a lot of progress here: we have a
javadocs-lint task that runs in Hudson, checks all of our links, and
fails if any are dead, etc.
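
For anyone who hasn't looked at what such a check involves, here is a
rough sketch of the idea in Java. This is not the actual javadocs-lint
task: the class name DeadLinkCheck, the regex, and the "skip anything
with a scheme" rule are all assumptions made for illustration, and a
real checker would also need to validate anchors and external URLs.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch: report relative links in generated HTML that point at missing files. */
final class DeadLinkCheck {
  // Captures the file part of an href; any "#fragment" or "?query" suffix is ignored.
  private static final Pattern HREF = Pattern.compile("href=\"([^\"#?]+)[^\"]*\"");

  static List<String> findDeadLinks(Path root) throws IOException {
    List<Path> pages;
    try (Stream<Path> walk = Files.walk(root)) {
      pages = walk.filter(p -> p.toString().endsWith(".html")).collect(Collectors.toList());
    }
    List<String> dead = new ArrayList<>();
    for (Path page : pages) {
      String html = new String(Files.readAllBytes(page), StandardCharsets.UTF_8);
      Matcher m = HREF.matcher(html);
      while (m.find()) {
        String target = m.group(1);
        if (target.contains(":")) {
          continue; // assumption: anything with a scheme (http:, mailto:) is out of scope
        }
        Path resolved = page.resolveSibling(target).normalize();
        if (!Files.exists(resolved)) {
          dead.add(page + " -> " + target);
        }
      }
    }
    return dead;
  }

  public static void main(String[] args) throws IOException {
    List<String> dead = findDeadLinks(Paths.get(args[0]));
    dead.forEach(System.err::println);
    if (!dead.isEmpty()) {
      System.exit(1); // a non-zero exit is what makes the build fail
    }
  }
}
```

The point is only that a link is something a build can verify
mechanically, which is exactly what an inlined code snippet is not.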

But this does us no good for code samples. I think we need to
seriously revisit/develop a plan for code samples in documentation.
All the samples we have in various docs (e.g. package documentation)
are very fragile, and that totally discourages me from adding any
advanced examples, or any more than are minimally necessary to get
started, because I'm afraid of the manual maintenance cost.

Instead I think we should set up a proper examples infrastructure,
where these examples are actually compiled and such. We can still link
to them in javadocs.
Have a look at this example from the demo/ module:

I think we should have more than just SearchFiles and IndexFiles, and
also move our examples here rather than inlining them in the javadocs
text. This way they are compile-time checked, and we can link to them
from anywhere (it's safe, and we have link-checkers that prove it).
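
To make the pattern concrete, here is a minimal sketch. Both classes
are hypothetical stand-ins (TokenCounter is not a Lucene class); the
only thing being shown is that the documentation carries a {@link} to
an example class that the build compiles, instead of carrying a copy
of the code.

```java
/**
 * Hypothetical library class, used here only to have something to document.
 *
 * <p>For a compiled usage example, see {@link TokenCounterExample}.
 */
final class TokenCounter {
  /** Counts whitespace-separated tokens; blank input yields zero. */
  static int count(String text) {
    String trimmed = text.trim();
    return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
  }
}

/**
 * Hypothetical example class. Because it is ordinary source code, an
 * incompatible change to {@link TokenCounter} breaks the build instead
 * of silently leaving a stale snippet in the docs.
 */
final class TokenCounterExample {
  static int run() {
    return TokenCounter.count("fast efficient self-documenting");
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

If TokenCounter.count were renamed, the example would stop compiling,
and if the example class were moved, the {@link} would be caught by
the link checks.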

I'm open to any other ideas though: this is just the best one I have right now.

