incubator-lucy-dev mailing list archives

From Marvin Humphrey
Subject Re: [lucy-dev] RegexTokenizer
Date Tue, 08 Mar 2011 20:19:09 GMT
On Tue, Mar 08, 2011 at 11:15:20AM -0800, David E. Wheeler wrote:
> I suggest bending C to Perl's namespacing rather than the other way around.

Python users have the option of aliasing the module:

    from Lucy.Analysis import RegexTokenizer as RegexTokenizer
    tokenizer = RegexTokenizer()

So do Perl users, thanks to the 'aliased' module from CPAN:

    use aliased 'Lucy::Analysis::RegexTokenizer' => 'RegexTokenizer';
    my $tokenizer = RegexTokenizer->new;

> It doesn't take a lot of munging to get what you want:

Your demo of the transform is admirably compact.  However, the existing naming
scheme is fairly deeply baked into our object system, and a good deal of the
code that touches on it is written in C.  Breaking the existing convention
would require a certain amount of work.

If someone is willing to work up a patch which makes "Lucy::Tokenizer::Regex"
possible, then we can consider it.  Until then, it has to be ruled out for
technical reasons.

FWIW, "Lucy::Tokenizer::Regex" implies that we would have a Lucy::Tokenizer
class, which would break another convention -- we no longer have any classes
which live directly under Lucy.

Marvin Humphrey
