uima-user mailing list archives

From Richard Eckart de Castilho <...@apache.org>
Subject Re: JCasGen like utility for generating POJO Classes in the type system
Date Tue, 16 Oct 2018 07:45:39 GMT
On 14. Oct 2018, at 00:31, Amit Paradkar <amp6064@gmail.com> wrote:
> 
> Thanks Marshall.
> I would like to construct objects in the typesystem that I have defined
> without having to pass in a cas object in the constructor (after I have
> detected features for a particular class in the text)
> 
> e.g, I have a Database type defined in my UIMA type descriptor based on
> which JCasGen generates a JCas cover class named Database.
> But all the public constructors in this class take a JCas object as an
> argument which precludes an independent construction of instances of these
> classes.
> I need to be able to construct an instance of such classes independent of
> any cas objects - since I need them to persist even after the current
> input text has been processed completely. So, almost a mirror typesystem is
> desired. Perhaps I am not using the notion of typesystem appropriately...

JCas cover classes are bound to a (J)CAS for several reasons. One simple reason is that
annotations do not contain the covered text themselves but need to query the
underlying (J)CAS to fetch a substring of the document text based on their
respective start/end positions.

Also, (J)CAS objects are often re-used. So retaining a JCas cover class instance
or a FeatureStructure for a longer period of time is usually a bad idea.

If you need to persist data, you could:

* store the entire (J)CAS including the text and all annotations - e.g. in XMI or binary format
  or even JSON. You can then later create an empty CAS and load the data back in (except for the
  JSON format, for which we do not yet have code to load the data back).
* iterate only over those annotations in the CAS interesting to you, create a generic POJO
  (e.g. a kind of Map), copy all relevant information from the annotation over into the POJO
  and then persist that. In order to do this, you might want to use the CAS API which allows
  not only iterating over annotations, but also over their features.
* you *could* also use the POJO approach with JCas, create one POJO class for every JCas
  class that you have, and then copy the information from the annotations over into your
  POJO classes. But I believe in this case you'd have to write a lot of copy code and you'd
  have to adapt that code every time your type system changes. Of course, you could use
  the Java reflection API to make the code more generic (e.g. by getting a list of 
  getters/setters on the JCas classes), but then you could also just use the CAS API
  and get the same effect with simpler and faster code.
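
To illustrate the second option, here is a minimal sketch of what such a generic,
CAS-independent POJO might look like. The class name AnnotationRecord and its methods
are hypothetical and not part of UIMA. The code uses only the JDK, so the step that
copies values out of the CAS is indicated in comments only; the UIMA calls named there
(getCoveredText(), getFeatureValueAsString(), ...) are what I would expect to use, and
only primitive-valued features are assumed to be copied.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical CAS-independent record for one annotation. It holds plain Java
 * values only, so it stays usable after the CAS has been reset or re-used.
 */
class AnnotationRecord implements Serializable {
    private static final long serialVersionUID = 1L;

    final String typeName;    // e.g. copied from fs.getType().getName()
    final int begin;          // e.g. copied from annotation.getBegin()
    final int end;            // e.g. copied from annotation.getEnd()
    final String coveredText; // copied eagerly, e.g. from annotation.getCoveredText()
    final Map<String, String> features = new LinkedHashMap<>();

    AnnotationRecord(String typeName, int begin, int end, String coveredText) {
        this.typeName = typeName;
        this.begin = begin;
        this.end = end;
        this.coveredText = coveredText;
    }

    /**
     * Stores one feature value as a string. With the CAS API this would be called
     * in a loop over fs.getType().getFeatures(), passing feature.getShortName()
     * and fs.getFeatureValueAsString(feature) for each primitive-valued feature.
     */
    AnnotationRecord with(String featureName, String value) {
        features.put(featureName, value);
        return this;
    }

    /** Serializes this record with standard Java serialization. */
    byte[] toBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(this);
        } catch (IOException e) {
            throw new IllegalStateException("Could not serialize record", e);
        }
        return buffer.toByteArray();
    }

    /** Restores a record that was written by toBytes(). No CAS is needed. */
    static AnnotationRecord fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (AnnotationRecord) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not deserialize record", e);
        }
    }
}
```

Because the record copies the covered text and the feature values eagerly, nothing in it
points back to the (J)CAS. Java serialization is used here only to keep the sketch short;
any other storage format would work equally well.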

Cheers,

-- Richard