uima-user mailing list archives

From Peter Klügl <peter.klu...@averbis.com>
Subject Re: Ruta conflicts with DKPro typesystem
Date Mon, 10 Apr 2017 10:25:55 GMT
Hi,


there are two options to avoid ambiguous references when types are
referenced by their short name.


The first one is to use an alias, as you did. However, you have to assign
an unambiguous alias. Ruta should check whether the alias is ambiguous, but
obviously doesn't. Try something like:

IMPORT org.apache.uima.ruta.type.NUM FROM
org.apache.uima.ruta.engine.BasicTypeSystem AS RutaNum;

Then you can use "RutaNum" to refer to
org.apache.uima.ruta.type.NUM in your rules.


... or something like IMPORT PACKAGE org.apache.uima.ruta.type FROM
org.apache.uima.ruta.engine.BasicTypeSystem AS ruta;

... then you should be able to use ruta.NUM in your rules.


(I have not tested either example.)
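
To make the alias idea concrete, a script using it might look like the sketch below. It is equally untested. "Amount" is a made-up annotation type used only for illustration; the import line is the one suggested above.

```
// Import Ruta's NUM under an alias that does not clash with DKPro's NUM
IMPORT org.apache.uima.ruta.type.NUM FROM org.apache.uima.ruta.engine.BasicTypeSystem AS RutaNum;

// Hypothetical type, declared here only for the example
DECLARE Amount;

// Refer to org.apache.uima.ruta.type.NUM through the alias
RutaNum{-> MARK(Amount)};
```

With the package alias variant, the rule would presumably use ruta.NUM in place of RutaNum.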


The second option is to activate the "strictImports" configuration
parameter. If activated, type expressions, e.g., short names, are
resolved only against the types that are imported in the script. Thus, if
you do not import the DKPro Core type system, the NUM of the Ruta type
system will be used. If deactivated, the references are resolved against
the names in the type system of the given CAS. If you create the CAS with
uimaFIT, then it also contains types that are not imported in your script.
In that mode, you would not even need to import the types in order to use
them in your script.


Both options have their advantages and disadvantages. Using strictImports
in generic scripts where you initialize type variables using
configuration parameters is problematic. However, if you have a larger
pipeline with unknown components and unknown type systems, strictImports
is often required, since there may be conflicts with other components
that cannot be known when writing the rules.


btw, there is also an updated example project using DKPro Core in Ruta:

https://github.com/pkluegl/ruta/tree/master/ruta-german-novel-with-dkpro



Let me know if this helps or if I should provide more information.


Best,


Peter




On 07.04.2017 at 15:01, Hugues de Mazancourt wrote:
> Hi,
>
> I’m using Ruta to perform information extraction and I mix it in a pipeline with DKPro-based
> resources (for POS-tagging and NER). Thus, I have my own type system, Ruta’s basic type
> system and some DKPro type systems (especially the one describing Tokens).
>
> I end up with type conflicts such as (Ruta error) :
>
>> java.lang.IllegalArgumentException: NUM is ambiguous, use one of the following instead:
>> de.tudarmstadt.ukp.dkpro.core.api.syntax.type.dependency.NUM org.apache.uima.ruta.type.NUM
>
> I tried to use declarations such as :
>
>> IMPORT org.apache.uima.ruta.type.NUM FROM org.apache.uima.ruta.engine.BasicTypeSystem AS NUM;
> at the top of my Ruta rule files, but this doesn’t help.
>
> I guess using « org.apache.uima.ruta.type.NUM » instead of « NUM » would fix the
> problem, but this wouldn’t increase readability of rules!
> The other solution I see would be to create my own, non-ambiguous, readable annotation
> and have a rule that marks all org.apache.uima.ruta.type.NUM with that annotation, but I’m
> afraid of performance issues due to these redundant annotations.
>
> Is there any other solution for Ruta to mask some types or alias them ?
>
> Best,
>
> Hugues de Mazancourt
> http://about.me/mazancourt