uima-user mailing list archives

From Diego Buoro <jklpo...@gmail.com>
Subject Re: Marking consecutive tokens with RUTA
Date Wed, 17 Jun 2015 16:30:41 GMT
Hi, Peter! We are trying to create the descriptors with Ruta 2.3,
but we're out of luck. We added the lines from the link you gave us to
the pom.xml file and adjusted the directory paths to suit our project.
However, when we run Maven with the Ruta plugin's "generate" goal, no files
are generated in the folders we configured. Is the goal supposed to generate
the files and leave them in those folders, or does it do something else?

Here is the link to our altered pom.xml. The plugin section is at the end
of the file:
https://raw.githubusercontent.com/Fichberg/cogroo4/labXP215_Will/cogroo-gc/pom.xml

Thanks for the help so far. :D

2015-06-14 9:40 GMT-03:00 Peter Klügl <peter.kluegl@averbis.com>:

> Hi,
>
> the descriptors are always created at compile time.
>
> In Ruta 2.2.1, yes, you need to create the descriptors in the UIMA Ruta
> Workbench and then copy them or make them available in some other way. This
> is especially necessary if you declare additional types (type system
> descriptor changes) or add some subscript (analysis engine descriptor
> changes).
>
> In Ruta 2.3.0, which was just released, there is a Maven plugin for
> building the descriptors. Take a look at:
> http://uima.apache.org/d/ruta-current/tools.ruta.book.html#d5e3271
> This means that you do not need the UIMA Ruta Workbench projects anymore,
> but you can use its development support and descriptor building in normal
> maven projects.
>
> Best,
>
> Peter
>
>
> On 12.06.2015 at 21:38, Diego Buoro wrote:
>
>> Hello Peter
>>
>> We tried your suggestions and they worked like a charm, thanks :D
>> However, we are facing another problem: it seems that our application
>> isn't creating the mainTypesystem and mainEngine files when we launch it.
>> We don't know whether that is the default behavior, but for now we have
>> to create these files in a separate project and then copy them to the
>> application whenever we change the script, which is a bad solution.
>> Do you have any suggestions?
>>
>> All Best,
>>
>> Diego
>>
>> 2015-06-12 9:19 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>
>>> Hi Peter, Armin
>>>
>>> Thanks for your observations; I hope we can finally get this working.
>>> We will try the changes in the next few days and then give you
>>> feedback.
>>>
>>> All Best,
>>>
>>> Diego
>>>
>>>
>>>
>>> 2015-06-03 14:14 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>
>>>> Hi Peter, the example we used is the short sentence in a string at
>>>> the end of UIMAChecker.java: "Refiro-me à trabalho remunerado.".
>>>> Based on the Main.ruta we sent you, we expected the output to contain 7
>>>> "PROBLEM" annotations. This part is working.
>>>> The problem appears when we change the last line of Main.ruta from
>>>> "cgToken{->PROBLEM};" to "cgToken cgToken{->PROBLEM};". In this case we
>>>> expected 6 "PROBLEM" annotations: the same ones as in the first
>>>> example, except for the first one. That's what happens when you run the
>>>> script in a simple Ruta project, but when we run it in the Java
>>>> application we get 0 "PROBLEM" annotations.
>>>> We think this difference arises because in the Ruta project we
>>>> don't use plain text as input. Instead, we feed it a preprocessed xmi
>>>> file. In the Java application, on the other hand, we do the processing
>>>> ourselves via the processCas method. It's possible that the processCas
>>>> method creates tokens in a way that prevents the Ruta script from
>>>> detecting when one is next to the other.
>>>> We are sending you the xmi file to use as an example for a simple Ruta
>>>> project. If there are any other examples you'd like us to send you, just
>>>> say the word :D
>>>>
>>>> Best,
>>>>
>>>> Diego
>>>>
>>>> 2015-06-01 11:15 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>>
>>>>> Sorry, please disregard my last answer. The idea wasn't to use the xmi;
>>>>> we are still working on a minimal example to provide to you.
>>>>> We will send it to you in the next few days.
>>>>>
>>>>> 2015-06-01 10:37 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>>>
>>>>>> Hi Peter, how are you doing?
>>>>>>
>>>>>> We were trying to run using files such as Crase01.xmi and
>>>>>> rule_xml_001.xmi.
>>>>>> Our goal is to run those two simpler ones first, and then run
>>>>>> with Crase.xmi.
>>>>>>
>>>>>> About the package declaration, I still need to check which Ruta version
>>>>>> we are using.
>>>>>> I will check this soon.
>>>>>>
>>>>>> All Best,
>>>>>>
>>>>>> Diego
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2015-05-30 0:45 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>>>>
>>>>>>> Hi Peter!
>>>>>>> No problem, I appreciate your support.
>>>>>>>
>>>>>>> All Best,
>>>>>>>
>>>>>>> Diego
>>>>>>>
>>>>>>> 2015-05-27 14:22 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>>>>>
>>>>>>>> Hi Peter!
>>>>>>>> We call the script with the following lines:
>>>>>>>>
>>>>>>>> URL url = Resources.getResource("Main.ruta");
>>>>>>>> String text = Resources.toString(url, Charsets.UTF_8);
>>>>>>>> AnalysisEngineDescription aeDes =
>>>>>>>>     Ruta.createAnalysisEngineDescription(text, tsd);
>>>>>>>> this.ae = UIMAFramework.produceAnalysisEngine(aeDes);
>>>>>>>>
>>>>>>>> CAS cas = ae.newCAS();
>>>>>>>> converter.populateCas(sentence.getTextSentence(), cas);
>>>>>>>> ae.process(cas);
>>>>>>>>
>>>>>>>> The populateCas method is responsible for translating our
>>>>>>>> annotations into RUTA annotations, but it doesn't set any type
>>>>>>>> priority explicitly. We don't know much about type priorities; the
>>>>>>>> RUTA references we found say very little about them. Are they
>>>>>>>> necessary for what we need?
>>>>>>>>
>>>>>>>> The file that contains the above lines is available here:
>>>>>>>>
>>>>>>>>
>>>>>>>> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/UIMAChecker.java
>>>>>>>> The processCAS method is available here:
>>>>>>>>
>>>>>>>>
>>>>>>>> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/uima/UimaCasAdapter.java
>>>>>>>> The script we are calling is available here:
>>>>>>>>
>>>>>>>>
>>>>>>>> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-ruta/script/Main.ruta
>>>>>>>>
>>>>>>>> PS: Yes, we remembered the semicolons.
>>>>>>>>
>>>>>>>> Thanks for the help :)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2015-05-26 15:30 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>>>>>>
>>>>>>>>> I think I wasn't clear enough, so I should be more specific.
>>>>>>>>>
>>>>>>>>> I have a type system in which all words have been annotated as
>>>>>>>>> Tokens. I am calling a RUTA script from a Java class, and that
>>>>>>>>> script has only one rule:
>>>>>>>>> Token Token {-> Problem}
>>>>>>>>>
>>>>>>>>> However, with this script, no Problems are created. When I try
>>>>>>>>> Token {-> Problem}
>>>>>>>>>
>>>>>>>>> I get one Problem for each Token, which is what I expected. Why
>>>>>>>>> can't I create annotations using rules with more than one word?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2015-05-26 14:49 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>>>>>>>
>>>>>>>>>> Hello guys, how are you doing?
>>>>>>>>>>
>>>>>>>>>> I would like to know, once I have called RUTA from a Java project,
>>>>>>>>>> how I can mark consecutive tokens as a "Problem" (the name of my
>>>>>>>>>> annotation, in this case).
>>>>>>>>>>
>>>>>>>>>> Thanks in advance!
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>
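
One way to test the hypothesis raised in the thread, that the adapter creates tokens whose offsets keep the two-token rule from matching, is to inspect the token spans directly. The following is only a rough sketch in plain Java: it does not call UIMA or Ruta, the class name TokenGapCheck and the offsets in main are hypothetical and hand-written for illustration, and which of the reported cases (if any) actually prevents a match is not established in this thread.

```java
import java.util.ArrayList;
import java.util.List;

final class TokenGapCheck {

    /** How two neighbouring token spans relate to each other. */
    enum Gap { TOUCHING, WHITESPACE_ONLY, OTHER_TEXT, OVERLAPPING }

    /** Classifies the text between a token ending at leftEnd and the next one starting at rightBegin. */
    static Gap classify(String text, int leftEnd, int rightBegin) {
        if (rightBegin < leftEnd) {
            return Gap.OVERLAPPING;
        }
        if (rightBegin == leftEnd) {
            return Gap.TOUCHING;
        }
        String between = text.substring(leftEnd, rightBegin);
        return between.trim().isEmpty() ? Gap.WHITESPACE_ONLY : Gap.OTHER_TEXT;
    }

    /** Classifies every pair of neighbouring spans; each span is {begin, end}, sorted by begin. */
    static List<Gap> classifyAll(String text, int[][] spans) {
        List<Gap> gaps = new ArrayList<>();
        for (int i = 0; i + 1 < spans.length; i++) {
            gaps.add(classify(text, spans[i][1], spans[i + 1][0]));
        }
        return gaps;
    }

    public static void main(String[] args) {
        // The example sentence from the thread; \u00e0 is the letter "à".
        String text = "Refiro-me \u00e0 trabalho remunerado.";
        // Hypothetical offsets, not taken from the real adapter.
        int[][] spans = { {0, 6}, {6, 7}, {7, 9}, {10, 11}, {12, 20}, {21, 31}, {31, 32} };
        List<Gap> gaps = classifyAll(text, spans);
        for (int i = 0; i < gaps.size(); i++) {
            System.out.println("token " + i + " -> token " + (i + 1) + ": " + gaps.get(i));
        }
    }
}
```

In the real application the spans would come from the begin and end offsets of the token annotations in the CAS after populateCas has run. Any pair reported as OVERLAPPING or OTHER_TEXT would be worth a closer look.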
