uima-user mailing list archives

From Peter Klügl <peter.klu...@averbis.com>
Subject Re: best practice for building RUTA scripts in Eclipse when they are to be run in Java?
Date Thu, 14 Jan 2016 16:19:45 GMT
Hi,

just a few short first comments... more tomorrow...

- Unfortunately, the images did not make it (due to the mailing list
settings?). You can send me the mail directly if you want.
- These days I much prefer to develop Ruta scripts in Maven-built
projects. Is Maven an option for you?
- You can limit JCasGen to the current project. Then, only local type
systems are used to generate the classes and the problem with overriding
RutaBasic is avoided. However, if you copy the descriptors, that does
not help.
- JCasGen on generated type systems of Ruta scripts can be tricky
(because Ruta imports the BasicTypeSystem by default and this one should
not be generated anew). I would rather recommend defining the types that
need JCas cover classes in separate type systems.
- Copying descriptors should be avoided in general
- Do you need the descriptors of Ruta at all? Did you define new types
in Ruta scripts? The Java code does not make use of the Ruta descriptors.
- The way you create the Ruta descriptors in the Java example does not
support all Ruta functionality, e.g., new types.
- The duplicate import is fixed in the next release
- Is the code open source somewhere, e.g., on github?
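
Until a release with the duplicate-import fix is available, a duplicated
import in a generated type system descriptor can be spotted with a small
stand-alone check. The following is only a minimal sketch that uses the
JDK's XML parser and is not part of UIMA or Ruta. The class name
DuplicateImportCheck and the method findDuplicateImports are made up for
illustration, and the sketch assumes that imports appear as
<import location="..."/> or <import name="..."/> elements, as in the
descriptor quoted further down.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not a UIMA or Ruta class.
class DuplicateImportCheck {

    /**
     * Returns the import targets (value of "location", else "name") that
     * occur more than once in the given descriptor XML, in first-seen order.
     */
    static List<String> findDuplicateImports(String descriptorXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Namespace awareness lets us match <import> by local name whether or
        // not the descriptor declares a default namespace.
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(descriptorXml)));

        NodeList imports = doc.getElementsByTagNameNS("*", "import");
        Set<String> seen = new LinkedHashSet<>();
        Set<String> duplicates = new LinkedHashSet<>();
        for (int i = 0; i < imports.getLength(); i++) {
            Element imp = (Element) imports.item(i);
            String target = imp.hasAttribute("location")
                    ? imp.getAttribute("location")
                    : imp.getAttribute("name");
            if (target.isEmpty()) {
                continue; // neither attribute present, nothing to compare
            }
            // Set.add returns false when the target was already recorded.
            if (!seen.add(target)) {
                duplicates.add(target);
            }
        }
        return new ArrayList<>(duplicates);
    }

    public static void main(String[] args) throws Exception {
        // Sample input mirroring the duplicated import from the quoted message.
        String xml = "<imports>"
                + "<import location=\"BasicTypeSystem.xml\"/>"
                + "<import location=\"BasicTypeSystem.xml\"/>"
                + "</imports>";
        System.out.println("Duplicate imports: " + findDuplicateImports(xml));
    }
}
```

The check only reports duplicates and does not rewrite the descriptor, so
the generated file is left untouched.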

Best,

Peter

On 14.01.2016 at 16:13, Bonnie MacKellar wrote:
> Hi,
>
> I just spent the last 4 days stumbling through the documentation,
> tutorials, posts to this mailing list, and any code examples I could
> find on the Internet, so I could integrate the Metamap annotator and a
> RUTA script in Java using UimaFit. I succeeded, and I have something
> that runs, but I doubt I am organizing things the best way in Eclipse,
> and in particular, I am noticing some odd things if I try to build and
> test the script first in the Ruta development environment in Eclipse
> and then move the script to my Java environment. I suspect my workflow
> is not the best possible, so I am looking for advice on how to manage
> this.
>
> My project was created as a Ruta project so I could have the
> development environment support. I then added Uima nature to the
> project to get the Java development folders. I set up the type
> descriptors for Metamap, and after much reading, realized I needed a
> types.txt file in my source folder that tells the system how to find
> the Metamap type descriptors. I then added the Ruta script to the
> pipeline in my Java class and then copied the type descriptor for that
> down to my source folders as well. Finally, I realized I needed java
> classes for the types, and that pressing a jCasGen button in the
> ComponentDescriptorEditor was the way to do that. However, there are
> some anomalies when I do this.
>
> So, my project has this structure at the top level
>
> Inline image 1
>
> and at the src level, this is the structure. Notice that the Ruta
> script and types have been copied down to this level
>
> Inline image 2
>
>
> The code that creates the AnalysisEngineDescriptions and runs the
> pipeline looks like this (it is in PipelineSystem.java)
>
> try {
>     ae = AnalysisEngineFactory.createEngine(
>             gov.nih.nlm.nls.metamap.uima.MetaMapAnnotator.class);
>     AnalysisEngineDescription mmEngineDesc =
>             AnalysisEngineFactory.createEngineDescription(
>                     gov.nih.nlm.nls.metamap.uima.MetaMapAnnotator.class);
>
>     AnalysisEngine rae = AnalysisEngineFactory.createEngine(
>             RutaEngine.class, RutaEngine.PARAM_MAIN_SCRIPT, "testrules");
>     AnalysisEngineDescription rutaEngineDesc =
>             AnalysisEngineFactory.createEngineDescription(
>                     RutaEngine.class, RutaEngine.PARAM_MAIN_SCRIPT, "testrules");
>     JCas jCas = ae.newJCas();
>     jCas.setDocumentText("serum albumin greater or equal 2g/dL");
>     SimplePipeline.runPipeline(jCas, mmEngineDesc, rutaEngineDesc);
>     displayResults(jCas);
>     displayRutaResults(jCas);
>
> and the types.txt file contains this
> classpath*:desc/types/MetaMapApiTypeSystem.xml
> classpath*:desc/types/BasicTypeSystem.xml
> classpath*:desc/types/InternalTypeSystem.xml
> classpath*:desc/types/testrulesTypeSystem.xml
>
>
> If I want to use the Ruta Workbench to develop my Ruta script, it
> appears that I have to regenerate the java type files, such as
> Relational.java, each time I make a change. Is that correct?
> And when I do this, I notice that it completely regenerates the
> org.apache.uima.ruta.type hierarchy, which leads to an odd runtime
> error (NoSuchMethodException, caused by trying to call
> setLowMemoryProfile). I read a thread on this list about this error
> which recommended deleting the regenerated uima type hierarchy. This
> worked, but it seems I have to go through these steps every time I
> regenerate the Ruta types, which is a pain.
>
> Also, I notice that the metamap type hierarchy is also regenerated
> inside my project. I theorize it is because of the import in my Ruta
> type descriptor
> TYPESYSTEM BasicTypeSystem;
> TYPESYSTEM BasicMetaMapTypeSystem;
> TYPESYSTEM MetaMapApiTypeSystem;
> DECLARE Relational,UMLSConcept;
> Candidate{ -> MARK(UMLSConcept)};
>
> Is this not the right way to make my script aware of the Metamap types?
>
> I also notice that in the type descriptor, this import is generated twice
> <imports>
>     <import location="BasicTypeSystem.xml"/>
>     <import location="BasicTypeSystem.xml"/>
> </imports>
>
> In general, is it a good or bad idea to develop the Ruta script in the
> workbench and then copy its pieces into the Java source folder? It
> seems like a very convoluted process.
>
> Thanks for your help
>
> Bonnie MacKellar

