uima-user mailing list archives

From Richard Eckart de Castilho <richard.eck...@gmail.com>
Subject Re: getting Type without CAS
Date Fri, 12 Apr 2013 10:55:47 GMT
Hi Georg,

> Is there a way to obtain a specific UIMA-type without having access to a CAS ?

You can only get "Type" instances from a CAS (there may be ways around that, but it hardly
makes sense).

> At the moment I create a TypeSystemDescription and a CollectionReader. With that reader
> I get my CASes. Only from those CASes I can get the TypeSystem from which I can get the type
> I want. The code always looks like this:
> void doSomething() {
>    TypeSystemDescription tsd = UIMAFramework.getXMLParser().parseTypeSystemDescription(new
>    CollectionReader cr = CollectionReaderFactory.createCollectionReader(MyReaderClass,
>    JCasIterable iterable = new JCasIterable(cr);
>    Type myType = null;
>    for (JCas aJCas : iterable) {
>        if (myType == null) {
>            myType = aJCas.getCas().getType("my.type.atype1");
>        }
>        AnnotationIndex<Annotation> index = aJCas.getAnnotIndex(myType);
>        doSomethingWithTheAnnotations()...
>    }
> }

If you have JCas wrappers for your types, stuff gets much easier. E.g. if you have a type
"my.type.Atype1", you'd do something like

for (JCas jcas : new JCasIterable(reader)) {
  Collection<Atype1> annotations = JCasUtil.select(jcas, Atype1.class);
  // do something with the annotations
}

If you do not have JCas wrappers, then it's a bit more complicated. You can e.g. do this:

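// class-level constant holding the fully qualified type name
static final String ATYPE1 = "my.type.Atype1";

for (JCas jcas : new JCasIterable(reader)) {
  CAS cas = jcas.getCas();
  // look up the Type by name in the current CAS, then select its annotations
  Collection<AnnotationFS> annotations = CasUtil.select(cas, CasUtil.getType(cas, ATYPE1));
  // do something with the annotations
}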

With the CAS interface only, accessing the features of an annotation is also more complicated.
E.g. with JCas wrappers you could do this:

  Atype1 annotation = …
  annotation.setMyValue("this is my value");

With CAS it's like

  AnnotationFS annotation = …
  Feature f = annotation.getType().getFeatureByBaseName("myValue");
  annotation.setStringValue(f, "this is my value");

You may want to consider using a pre-defined type system (as opposed to defining types on-the-fly
in TextMarker) and generating JCas wrappers for them, which you then can use with uimaFIT
(as in the examples above with JCasUtil and CasUtil) or with the plain UIMA API that you used
in your code.

> It looks a bit ugly from the logical structure that the type has to be obtained inside
> the for loop. I would think it desirable to have a method to get this type already outside
> the loop.

In any case (CAS or JCas), the annotations do not survive the loop, because the CAS is re-used
within JCasIterable. You'd have to copy all values that you want to continue using. Most people
would not want to use JCasIterable, but rather write a new UIMA component which further processes
the CAS or which writes results to some file or database, and use e.g. uimaFIT
SimplePipeline.runPipeline(reader, other-components…) to run these.
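
To illustrate the re-use point above, here is a minimal plain-Java sketch. It does not use UIMA at all: the Container class is a hypothetical stand-in for a CAS that is reset and refilled for each document, and all names in it are made up for illustration. It only shows why a reference kept from one iteration goes stale, and why copying the values inside the loop works.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReusedContainerSketch {

    // Hypothetical stand-in for a CAS: a single mutable object that is
    // cleared and refilled for every document (NOT a UIMA class).
    static final class Container {
        final List<String> annotations = new ArrayList<>();

        void reset() {
            annotations.clear();
        }
    }

    // Keeps a reference to the container's live list from the first
    // iteration. After the next reset it shows the later document's content.
    static List<String> keepReference(List<List<String>> documents) {
        Container container = new Container();
        List<String> kept = null;
        for (List<String> doc : documents) {
            container.reset();
            container.annotations.addAll(doc);
            if (kept == null) {
                kept = container.annotations; // same object, not a copy
            }
        }
        return kept;
    }

    // Copies the values out inside the loop, so they survive the reset.
    static List<String> copyValues(List<List<String>> documents) {
        Container container = new Container();
        List<String> copied = new ArrayList<>();
        for (List<String> doc : documents) {
            container.reset();
            container.annotations.addAll(doc);
            copied.addAll(container.annotations);
        }
        return copied;
    }

    public static void main(String[] args) {
        List<List<String>> docs =
                Arrays.asList(Arrays.asList("a1", "a2"), Arrays.asList("b1"));
        System.out.println(keepReference(docs)); // prints [b1]
        System.out.println(copyValues(docs));    // prints [a1, a2, b1]
    }
}
```

The kept reference ends up showing only the last document's content, while the copied list holds the values from every document.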


-- Richard
