uima-dev mailing list archives

From Thilo Goetz <twgo...@gmx.de>
Subject Re: [VOTE] accept the Configurable Feature Extractor (CFE) into the sandbox
Date Wed, 22 Oct 2008 15:51:14 GMT
Igor Sominsky wrote:
> Thilo,
> As Michael wrote, we will continue to support CFE in the future. It is extensively used in our
> research and development process. In fact, CFE is the only software that we use for extracting
> features for machine learning and evaluation. We are also extending CFE to include an evaluation
> as a part of its functionality. For that purpose we are enabling FESL with semantics required to
> specify rules for annotation comparison. Just to clarify, your observation is correct, as of now
> an evaluation is a separate piece of functionality and is not a part of CFE, although it works
> off CFE's output. But as I pointed out earlier, it is being integrated with CFE.

I was wondering if it's really a good idea to integrate it. It seems orthogonal to the feature
extraction part.

> Regarding EMF-based and XMLBeans parsers. Both parsers are generated from the FESL schema file. I
> do not see any problem in eliminating one parser or the other. I personally prefer to work with
> EMF-based parser as it is integrated with Eclipse and allows quick turn-around cycle in the case
> it is required to make changes to the schema.

The EMF stuff would make a lot of sense if you planned any Eclipse-based tooling. Outside Eclipse,
EMF-based XML parsing is difficult to get right, in my experience (because it really wasn't meant
to be used outside Eclipse).  Maybe that's no longer the case, though.  I haven't tried it in a
long time.

So, I'll vote +0 on this one.  If the other UIMA committers are happy to pick it up, and do the
work to integrate it into the sandbox, whip the documentation into shape, maybe simplify the
user-facing XML, I won't stop you.  I really think the world needs a tool like this, I'm just not
convinced this is the best approach.  Mind you, maybe there is no simpler way.  But if there isn't,
then I for one still prefer a few lines of Java I've written myself over a couple of pages of XML.
No offense.


> From: Thilo Goetz <twgoetz@gmx.de>
> To: uima-dev@incubator.apache.org
> Date: 10/22/2008 05:36 AM
> Subject: Re: [VOTE] accept the Configurable Feature Extractor (CFE) into the sandbox
> --------------------------------------------------------------------------------
> Michael Tanenblatt wrote:
>> Igor Sominsky is on vacation now, so he cannot respond, but I think it is safe to say that he
>> will continue to support this for the foreseeable future. It is something that he and I use
>> often, and he has been continuing to enhance and support it.
>> On Sep 29, 2008, at 9:51 AM, Thilo Goetz wrote:
>>> Who will maintain this code once it's in the sandbox?
> Igor, are you back?  Care to comment?  I'd like to see some assurance that this is not just a
> code drop before I vote.
> I'm also confused about the evaluation part of CFE.  I can see that it's useful to have this sort
> of evaluation, but should it really be part of a feature extraction package?  It seems a pretty
> independent sort of functionality.  Or maybe I just didn't understand it.
> You write in your user's guide that CFE depends on both XmlBeans and EMF.  Are you using EMF for
> anything but XML processing?  Do you think EMF could be eliminated and completely replaced by
> XmlBeans?  To be clear, this is not necessary, I'm just curious.
> --Thilo
