uima-user mailing list archives

From Diego Buoro <jklpo...@gmail.com>
Subject Re: Marking consecutive tokens with RUTA
Date Fri, 12 Jun 2015 19:38:43 GMT
Hello Peter

We tried your suggestions and they worked like a charm, thanks :D
However, we are facing another problem: our application doesn't seem to
create the mainTypesystem and mainEngine files when we launch it. We
don't know whether that is the default behavior, but for now we
have to create these files in a separate project and then copy them to
the application whenever we change the script, which is a bad solution.
Do you have any suggestions?

All Best,

Diego

2015-06-12 9:19 GMT-03:00 Diego Buoro <jklports@gmail.com>:

> Hi Peter, Armin
>
> Thanks for the observations, I hope we can finally get this working here.
> We will try the changes in the next few days and then give you feedback.
>
> All Best,
>
> Diego
>
>
>
> 2015-06-03 14:14 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>
>> Hi Peter, the example we used is the short sentence in a string at
>> the end of UIMAChecker.java: "Refiro-me à trabalho remunerado.".
>> Based on the Main.ruta we sent you, we expected the output to contain 7
>> "PROBLEM" annotations. This part is working.
>> The problem appears when we change the last line of Main.ruta from
>> "cgToken{->PROBLEM};" to "cgToken cgToken{->PROBLEM};". In this case we
>> expected 6 "PROBLEM" annotations: the same ones we had in the first
>> example, except for the first one. That's what happens when you run the
>> script in a simple Ruta project, but when we run it in the Java
>> application we get 0 "PROBLEM" annotations.
>> We think this difference arises because in the Ruta project we
>> don't use plain text as input. Instead, we feed it a preprocessed xmi
>> file. In the Java application, on the other hand, we do the processing
>> ourselves via the processCas method. It's possible that the processCas
>> method is creating tokens in a way that prevents the Ruta script from
>> detecting when one is next to the other.
>> We are sending you the xmi file to use as an example for a simple Ruta
>> project. If there are any other examples you'd like us to send you, just
>> say the word :D
>>
>> Best,
>>
>> Diego
>>
>> 2015-06-01 11:15 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>
>>> Sorry, please disregard my last answer. The idea wasn't to use the xmi;
>>> we are still working on a minimal example to provide to you.
>>> We will send it to you in the next few days.
>>>
>>> 2015-06-01 10:37 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>
>>>> Hi Peter, how are you doing?
>>>>
>>>> We were trying to run using files such as Crase01.xmi and
>>>> rule_xml_001.xmi.
>>>> Our goal is to run those two simpler ones first, and then run
>>>> with Crase.xmi.
>>>>
>>>> About the package declaration, I still need to check which Ruta version
>>>> we are using.
>>>> I will check this soon.
>>>>
>>>> All Best,
>>>>
>>>> Diego
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> 2015-05-30 0:45 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>>
>>>>> Hi Peter!
>>>>> No problem, I appreciate your support.
>>>>>
>>>>> All Best,
>>>>>
>>>>> Diego
>>>>>
>>>>> 2015-05-27 14:22 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>>>
>>>>>> Hi Peter!
>>>>>> We call the script with the following lines:
>>>>>>
>>>>>> URL url = Resources.getResource("Main.ruta");
>>>>>> String text = Resources.toString(url, Charsets.UTF_8);
>>>>>> AnalysisEngineDescription aeDes =
>>>>>>     Ruta.createAnalysisEngineDescription(text, tsd);
>>>>>> this.ae = UIMAFramework.produceAnalysisEngine(aeDes);
>>>>>>
>>>>>> CAS cas = ae.newCAS();
>>>>>> converter.populateCas(sentence.getTextSentence(), cas);
>>>>>> ae.process(cas);
>>>>>>
>>>>>> The populateCas method is responsible for translating our annotations
>>>>>> into RUTA annotations, but it doesn't set any type priority explicitly.
>>>>>> We don't know much about type priorities; the RUTA references we
>>>>>> found say very little about them. Are they necessary for what
>>>>>> we need?
>>>>>>
>>>>>> The file that contains the above lines is available here:
>>>>>>
>>>>>> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/UIMAChecker.java
>>>>>> The processCAS method is available here:
>>>>>>
>>>>>> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-gc/src/main/java/org/cogroo/tools/checker/checkers/uima/UimaCasAdapter.java
>>>>>> The script we are calling is available here:
>>>>>>
>>>>>> https://github.com/Fichberg/cogroo4/blob/labXP215_Will/cogroo-ruta/script/Main.ruta
>>>>>>
>>>>>> PS: Yes, we remembered the semicolons.
>>>>>>
>>>>>> Thanks for the help :)
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2015-05-26 15:30 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>>>>
>>>>>>> I think I wasn't clear enough, so let me be more specific.
>>>>>>>
>>>>>>> I have a type system in which all words have been annotated as
>>>>>>> Tokens. I am calling a RUTA script from a Java class, and that
>>>>>>> script has only one rule:
>>>>>>> Token Token {-> Problem}
>>>>>>>
>>>>>>> However, with this script, no Problems are created. When I try
>>>>>>> Token {-> Problem}
>>>>>>>
>>>>>>> I get one problem for each Token, which is what I expected. Why
>>>>>>> can't I create annotations using rules with more than one word?
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2015-05-26 14:49 GMT-03:00 Diego Buoro <jklports@gmail.com>:
>>>>>>>
>>>>>>>> Hello guys, how are you doing?
>>>>>>>>
>>>>>>>> I would like to know: once I have called RUTA from a Java
>>>>>>>> project, how can I mark consecutive tokens as a "Problem" (the name
>>>>>>>> of my annotation, in this case)?
>>>>>>>>
>>>>>>>> Thanks in advance!
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
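
For readers following the thread, the result expected from the two-element rule "cgToken cgToken{->PROBLEM};" in the quoted 2015-06-03 message can be modeled in plain Java. This is only a sketch of the intended outcome, not of how Ruta matches rules internally (Ruta works on its own basic annotations and filter settings). The class name, the Span type, and the 7-token split of the sentence are assumptions made for illustration; the thread only states that 7 and 6 annotations were expected.

```java
import java.util.ArrayList;
import java.util.List;

class ConsecutiveTokenSketch {

    /** Hypothetical stand-in for a token annotation: [begin, end) character offsets. */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    // Sentence from the thread; \u00e0 is the letter "a" with a grave accent,
    // escaped here to avoid source-encoding issues.
    static final String TEXT = "Refiro-me \u00e0 trabalho remunerado.";

    /** Assumed tokenization (7 tokens): Refiro, -, me, \u00e0, trabalho, remunerado, and the final period. */
    static List<Span> sampleTokens() {
        List<Span> tokens = new ArrayList<>();
        tokens.add(new Span(0, 6));   // Refiro
        tokens.add(new Span(6, 7));   // -
        tokens.add(new Span(7, 9));   // me
        tokens.add(new Span(10, 11)); // \u00e0
        tokens.add(new Span(12, 20)); // trabalho
        tokens.add(new Span(21, 31)); // remunerado
        tokens.add(new Span(31, 32)); // .
        return tokens;
    }

    /**
     * For tokens sorted by begin offset, returns every token that directly
     * follows another token, where "directly" means that only whitespace
     * (or nothing) lies between the two spans.
     */
    static List<Span> markSecondOfAdjacentPairs(String text, List<Span> tokens) {
        List<Span> problems = new ArrayList<>();
        for (int i = 1; i < tokens.size(); i++) {
            Span prev = tokens.get(i - 1);
            Span cur = tokens.get(i);
            boolean ordered = cur.begin >= prev.end;
            if (ordered && text.substring(prev.end, cur.begin).trim().isEmpty()) {
                problems.add(cur);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<Span> tokens = sampleTokens();
        List<Span> problems = markSecondOfAdjacentPairs(TEXT, tokens);
        // prints "7 tokens, 6 PROBLEM spans"
        System.out.println(tokens.size() + " tokens, " + problems.size() + " PROBLEM spans");
    }
}
```

Under these assumed offsets every token except the first is marked, which matches the 6 annotations the message expected. If the tokens built in the Java application had overlapping or out-of-order offsets, a check like this one would mark fewer spans, so comparing the offsets produced by the application with those in the preprocessed xmi file may be a useful first diagnostic.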
