lucene-dev mailing list archives

From "Rodrigo Reyes" <re...@charabia.net>
Subject Re: Normalization
Date Mon, 11 Mar 2002 22:00:15 GMT
Hi Brian,

> Great stuff, Rodrigo!  Welcome.

Thanks :-)

> will stop working.  So any such filtering language should produce code
> (or data) that becomes part of the program, rather than simply a
> configuration file along with the program.  In other words, it should
> be considered source code, not configuration data.

Good point. I had this drawback in mind, but I am not totally convinced
that the compilation step is really an effective safeguard; I'd rather
rely on clear explanations and warnings. However, since the parser and
interpreter are already written, it shouldn't be that hard to write a
source-code generator (at the very least, it would make things faster, and
that's not something I can be against). I mainly wanted to write it and the
French normalizer as a proof of concept.

> Great idea!  We'd love to have something like this.  This is the sort
> of contribution we're really looking for.  I'm willing to help write
> a parser for it if the language gets complicated.

Great. Some extensions may be needed, for example to describe additional
word-length constraints on rules, but I believe the language should stay as
simple as possible.
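
To illustrate what a word-length constraint on a rule might look like, here
is a minimal Java sketch. Everything in it is hypothetical: the class name
RuleSketch, the fields of Rule, and the example suffix rule are assumptions
made up for this illustration and are not taken from the actual normalizer
code or its rule language.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: suffix-rewrite rules guarded by a minimum word length. */
public class RuleSketch {

    /** One rule: replace a suffix, but only for words of at least minLength characters. */
    static final class Rule {
        final String suffix;
        final String replacement;
        final int minLength;

        Rule(String suffix, String replacement, int minLength) {
            this.suffix = suffix;
            this.replacement = replacement;
            this.minLength = minLength;
        }

        /** True when the word is long enough and ends with the rule's suffix. */
        boolean matches(String word) {
            return word.length() >= minLength && word.endsWith(suffix);
        }

        /** Swaps the suffix for the replacement; call only after matches() returned true. */
        String apply(String word) {
            return word.substring(0, word.length() - suffix.length()) + replacement;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    /** Registers a rule and returns this object so that calls can be chained. */
    RuleSketch add(String suffix, String replacement, int minLength) {
        rules.add(new Rule(suffix, replacement, minLength));
        return this;
    }

    /** Applies the first matching rule, or returns the word unchanged. */
    String normalize(String word) {
        for (Rule rule : rules) {
            if (rule.matches(word)) {
                return rule.apply(word);
            }
        }
        return word;
    }

    public static void main(String[] args) {
        // Made-up example rule: "eaux" -> "eau" for words of 5+ characters.
        RuleSketch normalizer = new RuleSketch().add("eaux", "eau", 5);
        System.out.println(normalizer.normalize("chateaux")); // long enough, rule applies
        System.out.println(normalizer.normalize("eaux"));     // too short, left unchanged
    }
}
```

The length check sits in the rule itself, so the rule language would only
need one extra optional field per rule to express it.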

Anyway, I'll try to add a few comments to the source code (although it's very
small, about 8 small classes) and package it so that the Lucene developers
can try it. It should be ready tomorrow.

Rodrigo




