uima-dev mailing list archives

From "Igor Sominsky" <somin...@gmail.com>
Subject Re: [VOTE] accept the Configurable Feature Extractor (CFE) into the sandbox
Date Wed, 22 Oct 2008 16:51:59 GMT
>Igor Sominsky wrote:
>> Thilo,
>> As Michael wrote, we will continue to support CFE in the future. It is 
>> extensively used in our
>> research and development process. In fact, CFE is the only software that 
>> we use for extracting
>> features for machine learning and evaluation. We are also extending CFE 
>> to include an evaluation
>> as a part of its functionality. For that purpose we are enabling FESL 
>> with semantics required to
>> specify rules for annotation comparison. Just to clarify, your observation 
>> is correct, as of now
>> an evaluation is a separate piece of functionality and is not a part of 
>> CFE, although it works
>> off CFE's output. But as I pointed out earlier, it is being integrated 
>> with CFE.

>I was wondering if it's really a good idea to integrate it. It seems 
>orthogonal to the feature
>extraction part.

The evaluation functionality is going to be built on top of the feature 
extraction; feature extraction will be just one step in the evaluation 
process. That is why extending the semantics for comparison seemed to 
make sense. CFE can still be used for feature extraction alone.

>> Regarding EMF-based and XMLBeans parsers. Both parsers are generated from 
>> the FESL schema file. I
>> do not see any problem in eliminating one parser or the other. I 
>> personally prefer to work with
>> EMF-based parser as it is integrated with Eclipse and allows quick 
>> turn-around cycle in the case
>> it is required to make changes to the schema.
>The EMF stuff would make a lot of sense if you planned any Eclipse-based 
>tooling. Outside Eclipse,
>EMF-based XML parsing is difficult to get right, in my experience (because 
>it really wasn't intended
>to be used outside Eclipse). Maybe that's no longer the case, though. I 
>haven't tried it in a long

I am invoking the feature extraction from Perl scripts with no issues. Setting 
the class path correctly does the job. CFE depends on only 5 jars from EMF.

>So, I'll vote +0 on this one. If the other UIMA committers are happy to 
>pick it up, and do the work
>to integrate it into the sandbox, whip the documentation into shape, maybe 
>simplify the user-facing
>xml, I won't stop you. I really think the world needs a tool like this, 
>just not convinced this is
>the best approach. Mind you, maybe there is no simpler way. But if there 
>isn't, then I for one
>still prefer a few lines of Java I've written myself over a couple of pages 
>of xml. No offense.

Well, it is a matter of personal preference, but going further, when 
developing a GUI tool, generating XML files seems to be a better option than 
generating Java code. Again, it is a personal preference.


> From: Thilo Goetz <twgoetz@gmx.de> To: uima-dev@incubator.apache.org Date: 
> 10/22/2008 05:36 AM
> Subject: Re: [VOTE] accept the Configurable Feature Extractor (CFE) into 
> the sandbox
> --------------------------------------------------------------------------------
> Michael Tanenblatt wrote:
>> Igor Sominsky is on vacation now, so he cannot respond, but I think it is 
>> safe to say that he
>> will continue to support this for the foreseeable future. It is something 
>> that he and I use
>> often, and he has been continuing to enhance and support it.
>> On Sep 29, 2008, at 9:51 AM, Thilo Goetz wrote:
>>> Who will maintain this code once it's in the sandbox?
> Igor, are you back? Care to comment? I'd like to see some assurance that 
> this is not just a
> code drop before I vote.
> I'm also confused about the evaluation part of CFE. I can see that it's 
> useful to have this sort
> of evaluation, but should it really be part of a feature extraction 
> package? It seems a pretty
> independent sort of functionality. Or maybe I just didn't understand it.
> You write in your user's guide that CFE depends on both XmlBeans and EMF. 
> Are you using EMF for
> anything but XML processing? Do you think EMF could be eliminated and 
> completely replaced by
> XmlBeans? To be clear, this is not necessary, I'm just curious.
> --Thilo