lucy-dev mailing list archives

From Marvin Humphrey <>
Subject Re: [lucy-dev] New branch LUCY-215-cf-extensions
Date Sun, 25 Mar 2012 17:02:07 GMT
On Sat, Mar 24, 2012 at 4:37 AM, Nick Wellnhofer <> wrote:

> This is now implemented in the branch.

Woo hoo!

> Also done.

Woo hoo some more!!

> I used a flag ('is_included') and didn't go with the CFCFile pointer
> approach.

Would it make more sense to use a flag for "is_tangible" instead of
"is_included"?  What if the user provides the same directory via both
add_include_dir() and add_source_dir()?


That would presumably set the "is_included" flag for any classes found in that
dir, and inappropriately prevent the generation of the parcel/boot/etc. C files.
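
To make the concern concrete, here is a minimal sketch in C. It is not taken
from the branch: dir_listed() and the two should_generate_*() functions are
hypothetical names, and the directory lists stand in for whatever
add_include_dir() and add_source_dir() record. It only contrasts deciding
"skip because it was included" with deciding "generate because it is
tangible" when the same directory shows up in both lists.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if `dir` appears in the NULL-terminated
 * list `dirs`. */
static bool
dir_listed(const char *dir, const char *const *dirs) {
    for (size_t i = 0; dirs[i] != NULL; i++) {
        if (strcmp(dirs[i], dir) == 0) {
            return true;
        }
    }
    return false;
}

/* "is_included" approach (assumed behavior): anything found in an include
 * dir is flagged, and flagged classes get no generated C files. */
static bool
should_generate_via_is_included(const char *dir,
                                const char *const *include_dirs) {
    bool is_included = dir_listed(dir, include_dirs);
    return !is_included;
}

/* "is_tangible" approach (assumed behavior): anything found in a source
 * dir gets generated C files, whether or not it is also an include dir. */
static bool
should_generate_via_is_tangible(const char *dir,
                                const char *const *source_dirs) {
    bool is_tangible = dir_listed(dir, source_dirs);
    return is_tangible;
}
```

With a directory present in both lists, the first function says "don't
generate" while the second says "generate", which is the behavior we would
want for a source dir.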

> One thing is core/Lucy/Util/ToolSet.h, which is included from every
> Clownfish .c file. I'm not sure what's the best way to handle that.
> Then the whole autobinding stuff doesn't work correctly yet. I only had a
> quick glance at the relevant code, but it seems that some parts of Clownfish
> are still hardcoded to Lucy.

I see you've been working on this today and will likely solve the problem if
you haven't already.  I'll just provide some historical background on that file.

Lucy::Autobinding defines a bunch of package global hashes which are used for
parameter validation.  They also used to be used for assigning default values,
though the implementation has since changed.

Every method that uses labeled parameters works something like this:

    package Lucy::Foo;
    use Carp qw( confess );

    our %do_stuff_PARAMS = (
        foo => undef,
        bar => 1,
    );

    sub do_stuff {
        my ( $self, %args ) = @_;

        # Ensure that no invalid params were supplied.
        for my $name ( keys %args ) {
            if ( !exists $do_stuff_PARAMS{$name} ) {
                confess("Invalid param name: '$name'");
            }
        }

        # Assign default values.
        while ( my ( $name, $val ) = each %do_stuff_PARAMS ) {
            if ( !defined $args{$name} ) {
                $args{$name} = $val;
            }
        }

        # Invoke core implementation.
        return _do_stuff( $self, @args{qw( foo bar )} );
    }

Nowadays, the actual validation happens in the function XSBind_allot_params().
Default values are no longer assigned via the PARAMS hashes, but are
hard-coded into the generated XS.

The fact that we use generated Perl in a file called "" rather
than generated XS to seed the PARAMS hashes is an implementation detail.  It
was just easier to write out Perl code.

It's surely possible to tweak the interface for XSBind_allot_params() so that
those PARAMS hashes are no longer necessary.  The names of all parameters are
now hard-coded in every auto-generated XS binding.  If nothing else, we could
just replace the PARAMS hash with a NULL-terminated char** argument:

     const lucy_CharBuf* normalization_form = NULL;
     chy_bool_t case_fold = true;
     chy_bool_t strip_accents = false;
     chy_bool_t args_ok = XSBind_allot_params(
-        &(ST(0)), 1, items, "Lucy::Analysis::Normalizer::new_PARAMS",
+        &(ST(0)), 1, items,
+        ["normalization_form", "case_fold", "strip_accents", NULL],
         ALLOT_OBJ(&normalization_form, "normalization_form", 18, false,
                   LUCY_CHARBUF, alloca(cfish_ZCB_size())),
         ALLOT_BOOL(&case_fold, "case_fold", 9, false),
         ALLOT_BOOL(&strip_accents, "strip_accents", 13, false),
     if (!args_ok) {

The NULL-terminated list of params is redundant since the param names
follow anyway, but it would be a little easier to implement based on how
XSBind_allot_params() works at the moment.
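
For illustration, validating supplied names against such a NULL-terminated
list might look roughly like the sketch below. This is not the actual
XSBind_allot_params() code: param_name_is_allowed() and
first_invalid_param() are hypothetical helpers, and plain C strings stand in
for the Perl stack entries the real function would inspect.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of XSBind: true if `name` (of length
 * `name_len`) exactly matches an entry in the NULL-terminated list
 * `allowed`. */
static bool
param_name_is_allowed(const char *name, size_t name_len,
                      const char *const *allowed) {
    for (size_t i = 0; allowed[i] != NULL; i++) {
        if (strlen(allowed[i]) == name_len
            && memcmp(allowed[i], name, name_len) == 0) {
            return true;
        }
    }
    return false;
}

/* Hypothetical helper: returns the first supplied name that is not in
 * `allowed`, or NULL if every supplied name is valid.  Both lists are
 * NULL-terminated. */
static const char*
first_invalid_param(const char *const *supplied,
                    const char *const *allowed) {
    for (size_t i = 0; supplied[i] != NULL; i++) {
        size_t len = strlen(supplied[i]);
        if (!param_name_is_allowed(supplied[i], len, allowed)) {
            return supplied[i];
        }
    }
    return NULL;
}
```

The explicit length argument mirrors the way the ALLOT_* macros above carry
a name length alongside each name, so a prefix such as "case" does not
match "case_fold".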

If we make that change, we are almost ready to junk  (The only
remaining task is to rework init_autobindings() so that it happens within the
"Lucy" package and gets invoked from within

Marvin Humphrey
