uima-user mailing list archives

From "Igor Sominsky" <somin...@gmail.com>
Subject Re: annotator based on regular expressions over (previous) annotations: state-of-work in UIMA?
Date Sat, 17 Jan 2009 01:04:40 GMT
Armando,

Now that I understand your goals better, you are right on all of the points
you made. Only the feature VALUES can be evaluated or transformed
with regular expressions. The overall search criteria must be specified
explicitly using FESL tags. I like the idea of using regexps for the search
very much; I am just not sure how complex the implementation would be,
although I may well be overestimating it.
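
To make "regular expressions applied to feature VALUES" concrete, here is a minimal sketch in Python. It is not CFE or FESL code: the function name transform_feature and the sample values are hypothetical, and it only illustrates filtering a value with a pattern and then rewriting it.

```python
import re


def transform_feature(value, pattern, replacement):
    """Rewrite value with a regex if the pattern matches; otherwise return
    None to signal that the value was filtered out.
    (Hypothetical helper, not part of CFE.)"""
    if re.search(pattern, value) is None:
        return None
    return re.sub(pattern, replacement, value)


# Hypothetical feature values taken from PersonTitle annotations.
titles = ["Mr.", "Dr.", "Prof"]
# Keep only the titles that end in a period, and strip that period.
normalized = [transform_feature(t, r"\.$", "") for t in titles]
```

The regex here only ever sees one value at a time; deciding which annotations to visit is a separate step.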

Please let me know if you need any other information on CFE or would like to
discuss it.

Thanks
Igor

----- Original Message ----- 
From: "Armando Stellato" <stellato@info.uniroma2.it>
To: <uima-user@incubator.apache.org>
Sent: Friday, January 16, 2009 7:16 PM
Subject: Re: annotator based on regular expressions over (previous) 
annotations: state-of-work in UIMA?


Hi Igor,

thanks for the pointer. I've had a quick look at your LREC paper:

http://domino.research.ibm.com/comm/research_projects.nsf/pages/medicalinformatics.pubs.html/$FILE/CFE_sominsky-A4.pdf

and a presentation I found on the Web:

http://watchtower.coling.uni-jena.de/~coling/uimaws_lrec2008/slides/sominsky_20080531_talk_CFE.pdf

At first glance, it seemed quite different from what I needed.
FESL is a transformer (I hope I am not abusing the term :-) ) of UIMA
features. The target may be new UIMA features or other kinds of data (judging
from the title of the paper and the example in figure 3, which suggests its use
in Machine Learning: useful information is extracted from the existing
annotations and fed to a learner). However, I tried to understand it
better, because it could still have the power to do what I was looking for,
which is to apply regular expressions over the content of a document, with
the elements of the expressions being not only strings, digits,
etc., but also Annotation types. For example (with a very simple syntax),
stating that:
.* {<PersonTitle> <Name>}
will create a new Annotation called Person when matching the string
"Mr John Doe" (previously annotated with PersonTitle and Name annotations).
Lastly, I think I found the problem: in the paper you mention regular
expressions as one of the 5 filters that can be applied to evaluate values
(upper right part of page 3 of the paper), but the overall search mechanism
(points a) to f), upper LEFT part of page 3) is not based on regular
expressions, nor, I think, does it have their power (though I will delve into
the details of point f) with further reading).
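
The PersonTitle/Name pattern above can be illustrated with a minimal sketch in Python. This is not UIMA or CFE code: the Annotation class and the find_sequences function are hypothetical stand-ins. The assumption is that annotations do not overlap and are considered adjacent when they follow each other in document order. Each annotation type is encoded as a single character, so that an ordinary regex engine can scan the sequence of types.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    type: str   # annotation type name, e.g. "PersonTitle"
    begin: int  # start offset in the document text
    end: int    # end offset (exclusive)


def find_sequences(annotations, type_sequence, new_type):
    """Return a new_type annotation for each run of consecutive annotations
    whose types equal type_sequence, in document order."""
    ordered = sorted(annotations, key=lambda a: (a.begin, a.end))

    # Give every type name its own private-use character, so that the
    # sequence of annotations becomes a string a normal regex can scan.
    symbols = {}

    def symbol(type_name):
        if type_name not in symbols:
            symbols[type_name] = chr(0xE000 + len(symbols))
        return symbols[type_name]

    encoded = "".join(symbol(a.type) for a in ordered)
    pattern = re.compile("".join(re.escape(symbol(t)) for t in type_sequence))

    found = []
    for match in pattern.finditer(encoded):
        first = ordered[match.start()]
        last = ordered[match.end() - 1]
        # The new annotation spans from the first matched one to the last.
        found.append(Annotation(new_type, first.begin, last.end))
    return found


text = "Mr John Doe"
annotations = [
    Annotation("PersonTitle", 0, 2),   # "Mr"
    Annotation("Name", 3, 11),         # "John Doe"
]
people = find_sequences(annotations, ["PersonTitle", "Name"], "Person")
covered = [text[p.begin:p.end] for p in people]
```

Because the encoded sequence is an ordinary string, the pattern could in principle also use quantifiers or alternation over annotation types; this sketch only handles a fixed sequence and ignores any text between the annotations.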

Based on what I gathered from the reading, I think it is not what I need,
though it could surely be included as part of it. For example (again with a
simple syntax):

.* {<Person>} "salary" <Currency>:normalizedvalue > 300000€

to extract instances of RichPerson.
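
A pattern that mixes annotation types, literal words and a constraint on a feature value, like the RichPerson one, could be sketched as follows in Python. Everything here is hypothetical (the Element class, the predicate helpers, the sample offsets), and a fixed-length sliding window stands in for a real regular expression engine.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    type: str                 # annotation type, or "Token" for plain words
    begin: int
    end: int
    text: str = ""
    features: dict = field(default_factory=dict)


def is_type(name):
    return lambda e: e.type == name


def is_literal(word):
    return lambda e: e.type == "Token" and e.text == word


def feature_greater_than(type_name, feature, threshold):
    # Matches only when the feature is present and above the threshold.
    return lambda e: (e.type == type_name
                      and e.features.get(feature, float("-inf")) > threshold)


def match_sequence(elements, predicates):
    """Yield every window of consecutive elements that satisfies the
    predicates one by one, in order."""
    n = len(predicates)
    for start in range(len(elements) - n + 1):
        window = elements[start:start + n]
        if all(p(e) for p, e in zip(predicates, window)):
            yield window


# Elements for the text "Mr John Doe salary 350000€" (offsets are made up
# for this example).
elements = [
    Element("Person", 0, 11, "Mr John Doe"),
    Element("Token", 12, 18, "salary"),
    Element("Currency", 19, 26, "350000€", {"normalizedvalue": 350000}),
]
pattern = [
    is_type("Person"),
    is_literal("salary"),
    feature_greater_than("Currency", "normalizedvalue", 300000),
]
# Only the Person part (the braces in the pattern) becomes a RichPerson.
rich = [("RichPerson", w[0].begin, w[0].end)
        for w in match_sequence(elements, pattern)]
```

Here the feature test is just one more predicate in the sequence, which is the sense in which value filters could be included as part of a sequence matcher.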

If I missed some crucial aspect, please let me know,

Thanks in advance,

Armando Stellato


> -----Original Message-----
> From: Igor Sominsky [mailto:sominsky@gmail.com]
> Sent: Friday, January 16, 2009 22:59
> To: uima-user@incubator.apache.org
> Subject: Re: annotator based on regular expressions over (previous)
> annotations: state-of-work in UIMA?
>
> Armando,
>
> In the posted version of CFE you can alter the value of an extracted feature
> by applying a Java regular expression. The code that is currently under
> development would allow several values to be combined using Java regular
> expressions or math expressions. The grammar of math expressions includes
> the capability to use Java functions and constants (through reflection).
>
> I hope that answers your question. Please let me know if you need more
> information.
>
> Thanks
> Igor
>
>
> ----- Original Message -----
> From: "Armando Stellato" <stellato@info.uniroma2.it>
> To: "UIMA" <uima-user@incubator.apache.org>
> Sent: Friday, January 16, 2009 1:19 PM
> Subject: annotator based on regular expressions over (previous) 
> annotations:
> state-of-work in UIMA?
>
>
> > Hi all,
> >
> >
> >
> > From a few posts, like the one at the following link:
> >
> >
> >
> > http://osdir.com/ml/apache.uima.general/2008-05/msg00070.html
> >
> >
> >
> > it seems that there is some interest in seeing this kind of processor in
> > the UIMA array of available components.
> >
> >
> >
> > Since we're considering developing a new one, but would prefer
> > not to reinvent the wheel :-), I'm asking whether someone is already
> > doing the same and, if so, for pointers to their work, whether it is
> > available, whether it's still a work in progress, etc.
> >
> >
> >
> > Best regards,
> >
> >
> >
> > Armando Stellato
> >
> >
> >
> > --------------------------------------------------
> >
> >
> >
> > Ing. Armando Stellato, PhD
> >
> > AI Research Group,
> >
> > Dept. of Computer Science, Systems and Production
> >
> > University of Roma, Tor Vergata
> >
> > Via del Politecnico 1 00133 ROMA (ITALY)
> >
> > tel: +39 06 7259 7330 (office, room A1-14);
> >
> >     +39 06 7259 7332 (lab)
> >
> > fax: +39 06 7259 7460
> >
> > e_mail: stellato@info.uniroma2.it
> >
> > yahoo: stellato75
> >
> > jabber(gtalk): stellato75@gmail.com <mailto:starred75@gmail.com>
> >
> > skype: odnamar
> >
> >
> >
> > --------------------------------------------------
> >
> >
> >
> >

