opennlp-dev mailing list archives

From James Kosin <>
Subject Re: Custom feature generators
Date Tue, 14 Jun 2011 02:32:36 GMT
On 6/13/2011 10:23 PM, wrote:
> Hi,
> Currently, custom feature generators can be passed from the command line
> only for NameFinder, but it would be very nice to have this for all tools.
> The Thai sentence detector customization is nice and simple, but to do
> something similar for other languages the user would need to branch the
> code. We should allow users to pass a factory class name from the command
> line. Maybe we could do it for every tool that doesn't use a sequence
> feature generator. It would also be nice to save the factory class name to
> the model to make sure we are using the same feature generator during
> runtime and evaluation.
> What do you think? Maybe you have thought of a better solution for that.
> Thanks
> William

We discussed various options; unfortunately, most involved some security
risk for the Java engine, including saving the actual feature generator
constructor itself to the model.  The XML option may be a better route in
the long run.  We could even save a copy of the XML document in the model
itself.  But again, that opens us up to problems if someone crafts bad XML
to cause trouble.
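
To make the security concern concrete, here is a minimal sketch of loading a
factory from a class name stored in a model, with an allowlist check before
any reflection happens.  All names here (FeatureGeneratorFactory,
DefaultFeatureGeneratorFactory, FactoryLoader) are hypothetical and are not
part of the OpenNLP API; the allowlist is just one possible mitigation, not
something we have agreed on.

```java
import java.util.Set;

// Hypothetical interface for illustration; not the actual OpenNLP API.
interface FeatureGeneratorFactory {
    String describe();
}

// Hypothetical factory that a model might name.
class DefaultFeatureGeneratorFactory implements FeatureGeneratorFactory {
    public String describe() {
        return "default";
    }
}

// Loads a factory by class name, but only if the name is explicitly allowed.
class FactoryLoader {
    private final Set<String> allowed;

    FactoryLoader(Set<String> allowed) {
        this.allowed = allowed;
    }

    FeatureGeneratorFactory load(String className) throws ReflectiveOperationException {
        // Refuse anything not on the allowlist before touching reflection,
        // so a tampered model cannot make us instantiate arbitrary classes.
        if (!allowed.contains(className)) {
            throw new SecurityException("Factory class not allowed: " + className);
        }
        Class<?> cls = Class.forName(className);
        // asSubclass rejects classes that do not implement the interface.
        return cls.asSubclass(FeatureGeneratorFactory.class)
                  .getDeclaredConstructor()
                  .newInstance();
    }
}
```

With a loader built from a single allowed name, a request for that class
returns an instance, while a name such as java.lang.Runtime is rejected with
a SecurityException before any class is loaded.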

Maybe we could make the feature generator a generic class that requires a
constructor.  Then each language implementation could supply its own
constructor that correctly builds the feature generator.  Unfortunately,
such a change would break any existing models.
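
A rough sketch of that idea, assuming a generic base class whose
language-specific behaviour comes in entirely through its constructor.  The
class names and the end-of-sentence character sets below are illustrative
assumptions, not the existing OpenNLP classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical generic generator: everything language-specific is passed
// in through the constructor.
class SentenceFeatureGenerator {
    private final char[] eosChars;

    SentenceFeatureGenerator(char[] eosChars) {
        this.eosChars = eosChars.clone();
    }

    // Emits one simple feature per candidate sentence boundary in the text.
    List<String> features(String text) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            for (char c : eosChars) {
                if (text.charAt(i) == c) {
                    out.add("eos=" + c + "@" + i);
                }
            }
        }
        return out;
    }
}

// Each language supplies a constructor that builds the generator correctly.
class EnglishSentenceFeatureGenerator extends SentenceFeatureGenerator {
    EnglishSentenceFeatureGenerator() {
        super(new char[] {'.', '!', '?'});
    }
}

// Assumed character set, for illustration only.
class ThaiSentenceFeatureGenerator extends SentenceFeatureGenerator {
    ThaiSentenceFeatureGenerator() {
        super(new char[] {' ', '\n'});
    }
}
```

The trade-off is the one mentioned above: a model would have to record which
constructor was used, so changing this would not be compatible with models
that were trained without that information.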

We may need to re-open the issue when Jorn comes back, or at least get
another discussion going so we can try to weed out the problems with the
options available.

