uima-user mailing list archives

From Nikolai Krot <tal...@gmail.com>
Subject Re: how to set scriptPath
Date Thu, 18 Apr 2019 11:58:03 GMT
Another interesting discovery concerning importing several scripts:
all ENGINE and SCRIPT statements should *precede* any CALL command, type
declarations, normal rules, etc., like this:

// file: script/uima/ruta/eng/aaa.ruta
PACKAGE uima.ruta.eng;

// all imports
ENGINE uima.ruta.eng.common.dateEngine;
SCRIPT uima.ruta.eng.common.date;
ENGINE uima.ruta.eng.common.xxxEngine;
SCRIPT uima.ruta.eng.common.xxx;

// and now run annotation
CALL(date);
CALL(xxx);

as opposed to the incorrect:

// file: script/uima/ruta/eng/aaa.ruta
PACKAGE uima.ruta.eng;

ENGINE uima.ruta.eng.common.dateEngine;
SCRIPT uima.ruta.eng.common.date;
CALL(date);

ENGINE uima.ruta.eng.common.xxxEngine; // <-- an error is reported here
SCRIPT uima.ruta.eng.common.xxx;
CALL(xxx);
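
For the record, the ordering rule above can also be checked mechanically
before running a script. Below is a minimal sketch in Python, not part of
Ruta or the Workbench: the function name find_late_imports is hypothetical,
it assumes that only ENGINE and SCRIPT count as imports (the two statements
discussed in this thread), and it only understands // line comments, not
block comments.

```python
import re

# Statements treated as imports. Assumption: only the two keywords discussed
# in this thread; real Ruta scripts may use further import statements.
IMPORT_KEYWORDS = ("ENGINE", "SCRIPT")


def find_late_imports(script_text):
    """Return (line_number, statement) pairs for ENGINE/SCRIPT statements
    that appear after the first non-import statement of the script."""
    late = []
    seen_other = False
    pattern = re.compile(r"(%s)\b" % "|".join(IMPORT_KEYWORDS))
    for number, raw in enumerate(script_text.splitlines(), start=1):
        # Drop // line comments and surrounding whitespace.
        line = raw.split("//", 1)[0].strip()
        # Blank lines and the PACKAGE declaration do not affect the ordering.
        if not line or line.startswith("PACKAGE"):
            continue
        if pattern.match(line):
            if seen_other:
                late.append((number, line))
        else:
            # Anything else (CALL, declarations, rules) ends the import section.
            seen_other = True
    return late


if __name__ == "__main__":
    incorrect = "\n".join([
        "PACKAGE uima.ruta.eng;",
        "ENGINE uima.ruta.eng.common.dateEngine;",
        "SCRIPT uima.ruta.eng.common.date;",
        "CALL(date);",
        "ENGINE uima.ruta.eng.common.xxxEngine;",
        "SCRIPT uima.ruta.eng.common.xxx;",
        "CALL(xxx);",
    ])
    for number, statement in find_late_imports(incorrect):
        print("line %d: import after other statements: %s" % (number, statement))
```

Run on the incorrect variant, the sketch flags the second ENGINE/SCRIPT
pair; on the correct variant it returns an empty list.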

Sorry for spamming.

BR, Nikolai

On Thu, Apr 18, 2019 at 12:40 PM Nikolai Krot <talpus@gmail.com> wrote:

> Hi,
>
> Great day! Finally, after many experiments, I managed to reuse another
> script that is not in the same directory as the current Ruta script.
>
> It turns out to be very easy to achieve:
>
> // file: script/uima/ruta/eng/aaa.ruta
> PACKAGE uima.ruta.eng;
>
> // these two lines import another script from a subdirectory
> ENGINE uima.ruta.eng.common.dateEngine;
> SCRIPT uima.ruta.eng.common.date;
>
> // either of these two performs annotation with the script
> // imported above
> CALL(date);
> //EXEC(date); // this also works
>
> I am not sure this is the right way to accomplish the goal, but still I
> want to leave the answer here for the record, as it turned out to be very
> time consuming to find the answer. Hopefully, the answer will save someone
> else's time :)
>
> Happy Easter!
>
> BR, Nikolai
>
>
> On Tue, Apr 16, 2019 at 11:14 AM Nikolai Krot <talpus@gmail.com> wrote:
>
>> Hi Peter,
>>
>> Thank you for your quick reply. I still cannot get it to work. Please read
>> my questions interleaved with your answers.
>>
>> I will start with an overview of the project structure I have set up:
>>
>> script/uima/ruta/deu/common/date.ruta
>> script/uima/ruta/eng/common/date.ruta
>> script/uima/ruta/eng/aaa.ruta
>>
>> So, the project is multilingual (hence deu and eng) and I want to keep
>> shared scripts under a *common/* directory. One such shared script is
>> *date.ruta*, for recognizing dates, because dates can appear in any type
>> of document.
>> Everything specific, which in my case means specific to a document type
>> or genre, goes outside the *common/* directory: *eng/aaa.ruta* is an
>> example of such a genre-dependent script.
>>
>> And now I want to reuse *eng/common/date.ruta* in said *eng/aaa.ruta*
>> that looks like this:
>>
>> // file: script/uima/ruta/eng/aaa.ruta
>> PACKAGE uima.ruta.eng;
>> SCRIPT uima.ruta.eng.common.date;  <-- this causes an error
>> Document {->CALL(date)};
>>
>> I tried to do it like this but it did not work. Hence I am asking this
>> question.
>>
>> On Tue, Apr 16, 2019 at 9:27 AM Peter Klügl <peter.kluegl@averbis.com>
>> wrote:
>>
>>> I assume that you use a simple Ruta project (compared to a Maven project
>>> with the ruta-maven-plugin)?
>>>
>>
>> true.
>>
>>
>>>
>>> Normally, you should not need to set the scriptPaths configuration
>>> parameter as it is set automatically by the builder to the absolute
>>> paths to the "script" folder of your ruta project.
>>>
>>> The file "descriptor/BasicEngine.xml" is only the template for the
>>> generated descriptors for your Ruta scripts. Thus, it isn't used for
>>> launching scripts.
>>>
>>
>> Do you mean that *descriptor/BasicEngine.xml* is used as a basis for the
>> other descriptor/*Engine.xml files? That is, the file
>> *descriptor/uima/ruta/eng/aaaEngine.xml* was generated from
>> BasicEngine.xml when I first created the *eng/aaa.ruta* file? Can the file
>> aaaEngine.xml be edited manually? Is it safe? I see that this file contains
>> absolute paths; what happens when I move this project to another
>> computer that has a different directory structure? And finally, should
>> aaaEngine.xml be committed to a git repository?
>>
>>
>>> Can you check the parameter value of the descriptor of the Ruta script
>>> you want to run?
>>>
>>
>> Sorry, I do not understand this. How do I locate it?
>>
>>
>>> If you want to use an additional script, you can simply import it in
>>> your script (using the correct package with "SCRIPT" and then execute it
>>> with "CALL"). The Workbench should take care of all the configuration.
>>>
>>
>> Unfortunately, it did not work. The error is
>>
>> Exception in thread "main"
>> org.apache.uima.resource.ResourceInitializationException: Initialization of
>> annotator class "org.apache.uima.ruta.engine.RutaEngine" failed.
>> (Descriptor: file:/path/to/zzz/descriptor/uima/ruta/eng/aaaEngine.xml)
>> ...
>> Caused by: org.apache.uima.ruta.extensions.RutaParseRuntimeException:
>> Error in aaa, line 11, "SCRIPT": expected 'none', but found ScriptString
>>
>>
>>>
>>> In case this does not work as expected, here are the two configuration
>>> parameter values you would need to set:
>>>
>>>             <nameValuePair>
>>>                 <name>scriptPaths</name>
>>>                 <value>
>>>                     <array>
>>>                         <string>C:/src/ws/ws-ta/ARutaTest/script</string>
>>>                     </array>
>>>                 </value>
>>>             </nameValuePair>
>>>
>>>
>>>             <nameValuePair>
>>>                 <name>additionalScripts</name>
>>>                 <value>
>>>                     <array>
>>>                         <string>uima.example.Test2</string>
>>>                     </array>
>>>                 </value>
>>>             </nameValuePair>
>>>
>>>
>>>
>> OK, I tried the following (in my case, this goes into said *aaaEngine.xml*
>> file, correct?) but it did not seem to help: I still get the same
>> error as above when I run the script:
>>
>>             <nameValuePair>
>>                 <name>additionalScripts</name>
>>                 <value>
>>                     <array>
>>                         <string>uima.ruta.eng.common.date</string>
>>                     </array>
>>                 </value>
>>             </nameValuePair>
>>
>> I also tried using the CONFIGURE command in the aaa.ruta script, but
>> honestly, I don't know how to specify an array of strings as a value, and
>> the documentation does not cover this case:
>>
>> CONFIGURE(hmlrEngine, "scriptPaths" = ["uima/ruta/eng/common"]); <-- invalid syntax
>>
>> Thank you and best regards,
>> Nikolai
>>
>>
>>
>>>
>>>
>>> On 15.04.2019 at 22:46, Nikolai Krot wrote:
>>> > Hi,
>>> >
>>> > In a Ruta rule script, I want to include and run another script that is
>>> > located in a subdirectory relative to the location of the current script.
>>> > From reading the documentation, it seems that the variable *scriptPath*
>>> > needs to be set to that subdirectory for the *SCRIPT* directive to work.
>>> > Or am I wrong?
>>> >
>>> > I cannot figure out how to set *scriptPath*. One possibility would be to
>>> > add this configuration to the *descriptor/BasicEngine.xml* file.
>>> > Unfortunately, I cannot find any example of how to accomplish it. Should
>>> > it be like below?
>>> >
>>> > <nameValuePair>
>>> >   <name>scriptPath</name>
>>> >   <value>
>>> >     <string>path/to/subdirectory</string>
>>> >   </value>
>>> > </nameValuePair>
>>> >
>>> > What if I want to set several path values?
>>> > Do values need to be relative to the project root directory?
>>> >
>>> > Thank you in advance,
>>> > Nikolai
>>> >
>>> --
>>> Dr. Peter Klügl
>>> R&D Text Mining/Machine Learning
>>>
>>> Averbis GmbH
>>> Salzstr. 15
>>> 79098 Freiburg
>>> Germany
>>>
>>> Fon: +49 761 708 394 0
>>> Fax: +49 761 708 394 10
>>> Email: peter.kluegl@averbis.com
>>> Web: https://averbis.com
>>>
>>> Headquarters: Freiburg im Breisgau
>>> Register Court: Amtsgericht Freiburg im Breisgau, HRB 701080
>>> Managing Directors: Dr. med. Philipp Daumke, Dr. Kornél Markó
>>>
>>>
