uima-user mailing list archives

From Sebastian Schaaf <sebastian.sch...@scai.fraunhofer.de>
Subject Re: RUTA in Java: access object contents
Date Thu, 29 Sep 2016 08:30:12 GMT
Hi Peter,

Since the RUTA 2.5 release was announced yesterday (congrats :) ) and the end of the month is
near: do you see a chance to work on the ticket for feature structure support?

Best,
Sebastian


----- Original Message -----
From: "Sebastian Schaaf" <sebastian.schaaf@scai.fraunhofer.de>
To: "user" <user@uima.apache.org>
Sent: Wednesday, 14 September 2016 22:08:39
Subject: Re: RUTA in Java: access object contents

Sounds great, thank you in advance!
Whatever comes up, don't hesitate to get back to us.

Cheers,
Sebastian


----- Original Message -----
From: "Peter Klügl" <peter.kluegl@averbis.com>
To: "user" <user@uima.apache.org>
Sent: Wednesday, 14 September 2016 18:33:56
Subject: Re: RUTA in Java: access object contents

Hi,


I created an issue for it: https://issues.apache.org/jira/browse/UIMA-5108


I won't be able to fix it this week, and maybe not next week, because of
some deadlines. I guess it will be fixed, at least in the trunk, before
the end of the month.



Best,


Peter


On 14.09.2016 at 18:25, Sebastian Schaaf wrote:
> Hi Peter,
>
> Indeed, I was talking about UIMA objects.
>
> We dug deeper into the error and now understand more of the code. Before
> going into details: yes, again, we fail on types extending TOP. In our case it is
> "concept", which has no covered text.
>
> In "SimpleFeatureEx", the public method "getFeatures" contains a for loop in which the
> different cases it can handle are listed in an 'if else' cascade; this also contains the
> support for arrays you wrote about. For our concept type the else case applies, so that the UIMA
> method "getFeatureByBaseName" gets called. This call fails because it checks whether the extracted
> feature comes from a type that extends the type we want to use the feature content for.
> In other words: our NormalizedNamedEntity type (extending Annotation) is queried for
> a feature contained in an instance of type concept. As the latter extends TOP (and
> not NormalizedNamedEntity), getFeatureByBaseName throws the error, although the desired
> content is fine (we get the string we want!).
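
The mismatch described above can be illustrated with a toy model. This is only a sketch in plain Java: it does not use the UIMA or Ruta API, and `ToyType`, `rangeOf`, `DomainCheckSketch`, and the feature names `concept` and `label` are assumptions made up for illustration. It contrasts looking up every step of a feature path on the start type (which fails once the path leaves the annotation type) with looking up each step on the range type of the previous step.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: these classes are NOT the UIMA or Ruta API.
final class ToyType {
    final String name;
    final ToyType parent; // null for the root type
    final Map<String, ToyType> features = new HashMap<>(); // feature name -> range type

    ToyType(String name, ToyType parent) {
        this.name = name;
        this.parent = parent;
    }

    // Returns the range type of a feature declared on this type or on a supertype.
    ToyType rangeOf(String feature) {
        for (ToyType t = this; t != null; t = t.parent) {
            ToyType range = t.features.get(feature);
            if (range != null) {
                return range;
            }
        }
        return null; // not declared anywhere in this type's ancestry
    }
}

final class DomainCheckSketch {
    // Looks up every path step on the start type, as in the failing case.
    static boolean resolvesOnStartTypeOnly(ToyType start, String... path) {
        for (String step : path) {
            if (start.rangeOf(step) == null) {
                return false;
            }
        }
        return true;
    }

    // Looks up each step on the range type of the previous step.
    static boolean resolvesStepwise(ToyType start, String... path) {
        ToyType current = start;
        for (String step : path) {
            current = current.rangeOf(step);
            if (current == null) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical type system: the entity type has offsets, the concept type does not.
    static ToyType buildEntityType() {
        ToyType top = new ToyType("TOP", null);
        ToyType string = new ToyType("String", top);
        ToyType annotation = new ToyType("Annotation", top);
        ToyType concept = new ToyType("concept", top);
        concept.features.put("label", string);    // hypothetical feature name
        ToyType entity = new ToyType("NormalizedNamedEntity", annotation);
        entity.features.put("concept", concept);  // hypothetical feature name
        return entity;
    }

    public static void main(String[] args) {
        ToyType entity = buildEntityType();
        System.out.println(resolvesOnStartTypeOnly(entity, "concept", "label")); // false
        System.out.println(resolvesStepwise(entity, "concept", "label"));        // true
    }
}
```

In this toy model the second lookup succeeds because `label` is searched on the concept type rather than on NormalizedNamedEntity. Whether the actual fix in Ruta takes this shape is not stated in the thread.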
>
> We also tried manipulating the types, temporarily declaring concept so that it extends 'legally' and
> this check does not fail. That works, but only for the moment, because in our environment
> this is not an option. Trying to implement the change ourselves in the Ruta source code showed that
> many lines of code would need adapting. Also, the variable 'result' in the for loop discussed above
> might be changed in an inadequate way; we don't know the details of RUTA.
>
> So the question is: would it be possible for you to implement the handling of cases where
> features extend from TOP? Maybe first as a patch, so that it does not have to be integrated into
> your release. We could then test whether it fails in our setting.
>
> So far,
> Best,
>
> Sebastian
>
> ----- Original Message -----
> From: "Peter Klügl" <peter.kluegl@averbis.com>
> To: "user" <user@uima.apache.org>
> Sent: Monday, 12 September 2016 13:26:07
> Subject: Re: RUTA in Java: access object contents
>
> Hi,
>
>
> first of all: what do you mean exactly by "our objects" and "given Java
> objects"? Real Java objects of some arbitrary class or feature
> structures (annotations) in UIMA? I assume that you were referring to
> UIMA objects and the getters are the getters of features in JCasGen
> cover classes. If not, you can skip the answer below ;-)
>
>
> What you describe should work just fine, were it not for the
> feature structures (the types extending TOP). Plain feature structures
> are hardly supported in Ruta, mainly for historical reasons. And many
> language elements do not make much sense without annotation offsets,
> e.g., sequential matching, conditions like contains and partof, ...
>
>
> There is no real technical reason why feature structures are not
> completely supported; there was just no need to support them. I
> personally just extended Annotation instead of Feature Structure even if
> there was no explicit semantics of the offsets. This is of course not an
> option if you already have a type system.
>
>
> I actually have to admit that I do not know right now where feature
> structures are and are not supported in Ruta. I added some minimal
> support for Arrays lately, and they are also just feature structures. I
> have to take a look...
>
>
> Back to your example:
>
> If you have
>
> - Type X extends Annotation with feature a with range A
>
> - Type A extends TOP with feature b with range B
>
> - Type B extends TOP with feature z with range String
>
> ... you would normally write:
>
> X.a.b.z=="z";
>
> to match on each annotation of type X, get the value of feature a of
> annotation X, get the value of feature b of the feature structure of
> type A, get the value of feature z of the feature structure of type
> B, and compare it to the string "z".
>
>
> The short answer is that you can access the getter just with the name of
> the feature.
>
>
> If this simple example does not work, then the reason is probably a
> simple instanceof check comparing the feature structure to AnnotationFS.
> Allowing feature structures only in feature expressions should not be
> much work.
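
To make the suspected instanceof guard concrete, here is a minimal sketch in plain Java. It is a toy model and not Ruta's implementation: `ToyFs`, `ToyAnnotation`, and `FeaturePathSketch` are hypothetical stand-ins for a plain feature structure, an annotation with offsets, and a feature path resolver. The `annotationsOnly` flag mimics a guard that only descends into annotations; switching it off accepts any feature structure, which is the kind of change discussed above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy stand-ins, NOT the UIMA classes: a plain feature structure ...
class ToyFs {
    final Map<String, Object> features = new HashMap<>();

    ToyFs with(String name, Object value) {
        features.put(name, value);
        return this;
    }
}

// ... and an annotation, i.e. a feature structure that also has offsets.
final class ToyAnnotation extends ToyFs {
    final int begin;
    final int end;

    ToyAnnotation(int begin, int end) {
        this.begin = begin;
        this.end = end;
    }
}

final class FeaturePathSketch {
    // Follows the path step by step; returns empty if a step cannot be resolved.
    static Optional<Object> resolve(ToyFs start, boolean annotationsOnly, String... path) {
        Object current = start;
        for (String step : path) {
            // The guard under discussion: which values may we descend into?
            boolean accepted = annotationsOnly
                    ? (current instanceof ToyAnnotation)
                    : (current instanceof ToyFs);
            if (!accepted) {
                return Optional.empty();
            }
            current = ((ToyFs) current).features.get(step);
            if (current == null) {
                return Optional.empty();
            }
        }
        return Optional.of(current);
    }

    public static void main(String[] args) {
        // X extends Annotation; A and B are plain feature structures (they extend TOP).
        ToyFs b = new ToyFs().with("z", "z");
        ToyFs a = new ToyFs().with("b", b);
        ToyFs x = new ToyAnnotation(0, 5).with("a", a);

        // Annotation-only guard: stops at A, so X.a.b.z cannot be evaluated.
        System.out.println(resolve(x, true, "a", "b", "z").isPresent());      // false
        // Relaxed guard: X.a.b.z == "z" can be evaluated.
        System.out.println("z".equals(resolve(x, false, "a", "b", "z").orElse(null))); // true
    }
}
```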
>
>
> Do you want me to add this support in Ruta? However, I cannot promise
> that the changes will be part of the upcoming release.
>
>
> Best,
>
>
> Peter
>
>
> On 12.09.2016 at 09:42, Sebastian Schaaf wrote:
>> Dear all,
>>
>> As we needed to integrate a rule-based analysis engine into our UIMA
>> framework, we ended up using RUTA. The package was encouraging, and we
>> made good progress projecting our ideas into RUTA (thanks to the
>> comprehensive documentation).
>>
>> We also saw that there are efforts to offer RUTA in plain Java code
>> for developers, bypassing the delivered workbench. We could integrate
>> it well with our modified type system, and it is finally running. But,
>> and that's the reason for this email, we are currently stuck with
>> extracting some information from our objects that is not
>> represented as a simple feature. Ruling out major changes to our
>> code and not liking the idea of permanent
>> workarounds, we were wondering if (and if not, maybe when) there is
>> the possibility to generically call methods on given Java objects.
>> Precisely, we have objects with attributes being (linked from other)
>> objects, plus respective getter methods. So, the information we need
>> from our objects is retrievable by calling a getter X.getA, resulting
>> in (background) object A, which in turn knows a method .getB,
>> resulting in the desired B (or more precisely: its string Z):
>>
>> ### Example ###
>> Type X (extends Annotation, has offsets)
>> Type A (extends TOP, has no offsets)
>> Type B (extends TOP, has no offsets)
>> String Z
>>
>> How to call "X.getA().getB().getZ()"?
>> ##############
>>
>> It appeared that RUTA is capable, by some (UIMA-?)magic, of
>> getting e.g. the covered text of a text annotation by "X.coveredText",
>> although the object only knows a "getCoveredText" method. Let's call
>> it a 'pseudo-feature'. No idea how generic this is, but: when just
>> querying "X.A", RUTA seems to do well, ultimately receiving A. Since A
>> is an object (simple data type expected, like integer and string?),
>> everything stops there. So no obvious chance to receive our B.
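
For comparison, generically calling getters on arbitrary Java objects can be sketched with standard Java reflection. This is not how Ruta resolves features and makes no claim about its internals; `GetterChainSketch` and the demo classes `X`, `A`, and `B` are hypothetical and only mirror the example above. The sketch maps a name such as `coveredText` to `getCoveredText` and follows a dotted path by invoking the matching public no-argument getter at each step.

```java
import java.lang.reflect.Method;

final class GetterChainSketch {
    // Maps a feature-like name such as "coveredText" to the getter name "getCoveredText".
    static String getterName(String name) {
        return "get" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
    }

    // Follows a dotted path by calling the public no-argument getter for each step.
    static Object resolve(Object start, String dottedPath) {
        Object current = start;
        for (String step : dottedPath.split("\\.")) {
            if (current == null) {
                return null; // an intermediate getter returned null
            }
            try {
                Method getter = current.getClass().getMethod(getterName(step));
                current = getter.invoke(current);
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Cannot resolve step '" + step + "'", e);
            }
        }
        return current;
    }
}

// Hypothetical demo classes mirroring the example: X -> A -> B -> string Z.
final class GetterChainDemo {
    public static final class B {
        public String getZ() { return "z"; }
    }

    public static final class A {
        public B getB() { return new B(); }
    }

    public static final class X {
        public A getA() { return new A(); }
    }

    public static void main(String[] args) {
        // Equivalent to new X().getA().getB().getZ()
        System.out.println(GetterChainSketch.resolve(new X(), "a.b.z")); // z
    }
}
```

A reflective approach like this bypasses the declared type system entirely, so it is shown only to clarify the question, not as a recommendation.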
>>
>> Is there an easy, somewhat 'native' way to deal with object-derived 
>> data like in the case described above?
>>
>> Thanks in advance!
>>
>> Sebastian
>>
>>
>> ---
>> Sebastian Schaaf, M.Sc. Bioinformatics
>>
>> Fraunhofer-Institute for Algorithms and Scientific Computing (SCAI)
>> Department of Bioinformatics
>> Schloss Birlinghoven
>> D-53754 Sankt Augustin
>>
>> Room: C3-233
>> Tel.: +49 2241 14 2280
>> Email: sebastian.schaaf@scai.fraunhofer.de
>> Internet: http://www.scai.fraunhofer.de/
>>
