uima-user mailing list archives

From Peter Klügl <peter.klu...@averbis.com>
Subject Re: Marking consecutive tokens with RUTA
Date Wed, 17 Jun 2015 19:20:23 GMT
Hi,

UIMA Ruta 2.3.0 and also the maven plugin require Java 7. Thus, the 
maven build process has to use the correct Java version. Just wanted to 
mention it because I had this problem right away.

The descriptors are not built because the plugin does not find any ruta 
files. The maven plugin is specified in one project while the ruta files 
are located in a different project. The ruta maven plugin only collects 
ruta files within the basedir of its own project, so no files are built.

In the next release, the maven plugin will get another parameter for 
specifying the input files.

With UIMA Ruta 2.3.0, there are two options: either you put the ruta 
files in the project with the ruta maven plugin, or you add the ruta 
maven plugin to the pom of the project that contains the ruta files.
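
For illustration, the second option might look roughly like the fragment 
below, placed in the pom of the project that holds the ruta files. This is 
only a sketch: it assumes the plugin coordinates 
org.apache.uima:ruta-maven-plugin and the "generate" goal mentioned in the 
quoted message; the phase binding is an assumption, and the output 
directories are left at whatever the plugin defaults to.

```xml
<!-- Sketch only: goes into <build><plugins> of the project that contains
     the .ruta files, so they lie within the plugin's basedir. -->
<plugin>
  <groupId>org.apache.uima</groupId>
  <artifactId>ruta-maven-plugin</artifactId>
  <version>2.3.0</version>
  <executions>
    <execution>
      <id>default</id>
      <!-- Assumed phase binding; adjust to your build if needed. -->
      <phase>process-classes</phase>
      <goals>
        <goal>generate</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

Remember that the build has to run on Java 7 for the plugin to work.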

Best,

Peter

On 17.06.2015 at 18:30, Diego Buoro wrote:
> Hi, Peter! We are attempting to create the descriptors based on Ruta 2.3,
> but we're out of luck. We've added the lines from the link you gave us to
> the pom.xml file and corrected the directory paths to suit our project.
> However, when we run Maven with Ruta's "generate" goal, no files are
> generated in the folders we set. Is the goal supposed to generate the files
> and leave them in the folder or does it do something else?
>
> Here is the link to our altered pom.xml. The plugin section is at the end
> of the file:
> https://raw.githubusercontent.com/Fichberg/cogroo4/labXP215_Will/cogroo-gc/pom.xml
>
> Thanks for the help so far. :D
>
> 2015-06-14 9:40 GMT-03:00 Peter Klügl <peter.kluegl@averbis.com>:
>
>> Hi,
>>
>> the descriptors are always created at compile time.
>>
>> In Ruta 2.2.1, yes, you need to create the descriptors in the UIMA Ruta
>> Workbench and then copy them or make them available in some other way. This
>> is especially necessary if you declare additional types (type system
>> descriptor changes) or add some subscript (analysis engine descriptor
>> changes).
>>
>> In Ruta 2.3.0 which was just released, there is a maven plugin for
>> building the descriptors. Take a look at:
>> http://uima.apache.org/d/ruta-current/tools.ruta.book.html#d5e3271
>> This means that you do not need the UIMA Ruta Workbench projects anymore,
>> but you can use its development support and descriptor building in normal
>> maven projects.
>>
>> Best,
>>
>> Peter
>>
>>
>> On 12.06.2015 at 21:38, Diego Buoro wrote:
>>
>>> Hello Peter
>>>
>>> We tried your suggestions and they worked like a charm, thanks :D
>>> However, we are facing another problem: it seems that our application
>>> isn't creating the mainTypesystem and mainEngine files when we launch it.
>>> We don't know whether or not that is the default behavior, but for now we
>>> have to create these files in a separate project and then copy them to
>>> the application whenever we change the script, which is a bad solution.
>>> Do you have any suggestions?
>>>
>>> All Best,
>>>
>>> Diego
>>>
>>> 2015-06-12 9:19 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>
>>>> Hi Peter, Armin
>>>> Thanks for the observations; I hope we can finally get it working here.
>>>> We will try the changes in the next few days and then give you
>>>> feedback.
>>>>
>>>> All Best,
>>>>
>>>> Diego
>>>>
>>>>
>>>>
>>>> 2015-06-03 14:14 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>>
>>>>> Hi Peter, the example we used is the short sentence inside a string at
>>>>> the end of UIMAChecker.java: "Refiro-me à trabalho remunerado.".
>>>>> Based on the Main.ruta we sent you, we expected the output to contain 7
>>>>> "PROBLEM" annotations. This part is working.
>>>>> The problem arises when we change the last line of Main.ruta from
>>>>> "cgToken{->PROBLEM};" to "cgToken cgToken{->PROBLEM};". In this case we
>>>>> expected 6 "PROBLEM" annotations: the same ones we had in the first
>>>>> example, except for the first one. That's what happens when you run the
>>>>> script in a simple Ruta project, but when we run it in the Java
>>>>> application we get 0 "PROBLEM" annotations.
>>>>> We think this difference arises because in the Ruta project we
>>>>> don't use plain text as input. Instead, we feed it a preprocessed xmi
>>>>> file. In the Java application, on the other hand, we do the processing
>>>>> ourselves via the processCas method. It's possible that the processCas
>>>>> method is creating tokens in a way that prevents us from detecting when
>>>>> one is next to the other in the Ruta script.
>>>>> We are sending you the xmi file to use as an example for a simple Ruta
>>>>> project. If there are any other examples you'd like us to send you, just
>>>>> say the word :D
>>>>>
>>>>> Best,
>>>>>
>>>>> Diego
>>>>>
>>>>> 2015-06-01 11:15 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>>>
>>>>>> Sorry, please disregard my last answer. The idea wasn't to use the xmi;
>>>>>> we are still thinking of a minimal example to provide to you.
>>>>>> We will send it to you in the next few days.
>>>>>>
>>>>>> 2015-06-01 10:37 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>>>>
>>>>>>> Hi Peter, how are you doing?
>>>>>>> We were trying to run using files such as Crase01.xmi and
>>>>>>> rule_xml_001.xmi.
>>>>>>> Our goal is to run those two simpler ones first, and then run
>>>>>>> with Crase.xmi.
>>>>>>>
>>>>>>> About the package declaration, I still need to check which Ruta
>>>>>>> version it is.
>>>>>>> I will check this soon.
>>>>>>>
>>>>>>> All Best,
>>>>>>>
>>>>>>> Diego
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2015-05-30 0:45 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>>>>>
>>>>>>>> Hi Peter!
>>>>>>>> No problem, I appreciate your support.
>>>>>>>>
>>>>>>>> All Best,
>>>>>>>>
>>>>>>>> Diego
>>>>>>>>
>>>>>>>> 2015-05-27 14:22 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>>>>>>
>>>>>>>>> Hi Peter!
>>>>>>>>> We call the script with the following lines:
>>>>>>>>>
>>>>>>>>>     URL url = Resources.getResource("Main.ruta");
>>>>>>>>>     String text = Resources.toString(url, Charsets.UTF_8);
>>>>>>>>>     AnalysisEngineDescription aeDes =
>>>>>>>>>         Ruta.createAnalysisEngineDescription(text, tsd);
>>>>>>>>>     this.ae = UIMAFramework.produceAnalysisEngine(aeDes);
>>>>>>>>>
>>>>>>>>>     CAS cas = ae.newCAS();
>>>>>>>>>     converter.populateCas(sentence.getTextSentence(), cas);
>>>>>>>>>     ae.process(cas);
>>>>>>>>>
>>>>>>>>> The populateCas method is responsible for translating our annotations
>>>>>>>>> into RUTA annotations, but it doesn't set any type priority explicitly.
>>>>>>>>> We don't know much about type priorities; the RUTA references we
>>>>>>>>> found say very little about them. Are they necessary for doing what
>>>>>>>>> we need?
>>>>>>>>>
>>>>>>>>> The file that contains the above lines is available here:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/UIMAChecker.java
>>>>>>>>> The processCAS method is available here:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/uima/UimaCasAdapter.java
>>>>>>>>> The script we are calling is available here:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-ruta/script/Main.ruta
>>>>>>>>>
>>>>>>>>> PS: Yes, we remembered the semicolons.
>>>>>>>>>
>>>>>>>>> Thanks for the help :)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2015-05-26 15:30 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>>>>>>>
>>>>>>>>>> I think I wasn't clear enough, so I should be more specific.
>>>>>>>>>> I have a type system in which all words have been annotated as
>>>>>>>>>> Tokens. I am calling a RUTA script from a Java class, and that
>>>>>>>>>> script has only one rule:
>>>>>>>>>> Token Token {-> Problem}
>>>>>>>>>>
>>>>>>>>>> However, with this script, no Problems are created. When I try
>>>>>>>>>> Token {-> Problem}
>>>>>>>>>>
>>>>>>>>>> I get one Problem for each Token, which is what I expected. Why
>>>>>>>>>> can't I create annotations using rules with more than one word?
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2015-05-26 14:49 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>>>>>>>>
>>>>>>>>>>> Hello guys, how are you doing?
>>>>>>>>>>> I would like to know: once I have called RUTA from a Java project,
>>>>>>>>>>> how can I mark consecutive tokens as a "Problem" (the name of my
>>>>>>>>>>> annotation, in this case)?
>>>>>>>>>>>
>>>>>>>>>>> Thanks in advance!
>>>>>>>>>>>
>>>>>>>>>>>

