From: Richard Eckart de Castilho
To: user@uima.apache.org
Subject: Re: JCasGen support for CAS-transported custom Java objects
Date: Wed, 21 Oct 2020 22:34:42 +0200

Hi Mario,

> On 21. Oct 2020, at 21:26, Mario Juric wrote:
>
> We never had problems migrating from one type system as long as the
> types were either extended or something was deleted. The problem we
> had was when an attribute changed type, e.g. a change from a simple
> FSArray to a wrapper type with the custom Java object and an FSArray.
> We tried something similar last year where a type A had an FSArray
> attribute with elements of another type B that previously inherited
> from Annotation, and we changed that to inherit from TOP instead,
> while all of the attributes of B that we had declared remained
> unchanged. Not surprisingly the deserialiser couldn’t load the old
> CAS leniently with this change, and we never figured out how to do a
> conversion, if that is at all possible, since A can only take one
> form, i.e. we haven’t figured out how to have two versions of A
> simultaneously in order to make a conversion. Maybe there are some
> lower level CAS possibilities that we are not aware of yet. The
> problem should be the same when changing the type of an attribute
> from FSArray to a wrapper type with custom Java objects.

Ok, I think I get the picture now. I was imagining creating a new type
that would replace the old one and basically copying the data over into
the new structure. You are thinking of basically modifying a type
"in-place".

I think this is doable in the following way:

1) create a CAS "oldCas" with your existing type system

    CAS oldCas = CasFactory.createCas(
        TypeSystemDescriptionFactory.createTypeSystemDescriptionFromPath(
            "old_typesystem.xml"));

2) create a CAS "newCas" with your new type system

    CAS newCas = CasFactory.createCas(
        TypeSystemDescriptionFactory.createTypeSystemDescriptionFromPath(
            "new_typesystem.xml"));

3) implement a method taking two CASes and copying the data from one to
   the other while massaging the relevant feature structures according
   to the changes in the type system

    void copyAndUpgradeCas(CAS oldCas, CAS newCas) {
      // Recursively collect all accessible feature structures in oldCas

      // For each feature structure, create a copy in newCas:
      // if the feature structure is of a type which changed, copy the
      // data according to the changes; otherwise, copy it 1-to-1
      // (or at least the primitive values)

      // Record which old FS was mapped to which new FS, so the mapping
      // can be used to connect FS references in a second pass

      // In a second pass, copy/convert the FS references
      // (i.e. non-primitive features)

      // Optionally repeat the process for other views in the CAS
    }

(Basically, step 3 is in a sense CasCopier - just a custom one where you
apply a data transformation instead of just copying the data.)

Important for this to work is that you use the CAS API and stay away
from the JCas API!

If you had XMI data instead of binary CASes, I would have suggested that
DKPro Cassis might be a route to explore. With this library, you can
load XMI CAS objects into Python, and Python objects are notoriously
flexible and malleable - much more so than CAS / JCas objects. I didn't
dig into it, but I could imagine that a CAS and type system loaded using
DKPro Cassis could be monkey-patched in-place into a new structure.
But then again, I haven't tried using Cassis for this purpose, but I am
quite confident that the Java-based approach I outlined above should be
doable.

Cheerio,

-- Richard
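
The two-pass idea from step 3 above (first copy every reachable feature
structure and record the old-to-new mapping, then wire up the references
through that mapping) can be sketched without any UIMA dependency. This
is only an illustration under stated assumptions: `Fs`, `TwoPassCopy`,
`mapType` and the type names "OldB" / "NewB" are hypothetical stand-ins
invented for this sketch and are not part of the UIMA or uimaFIT API. A
real implementation would read and write features through the CAS API
instead.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a feature structure; NOT a UIMA class.
final class Fs {
    final String type;
    // Primitive feature values (ints, strings, ...), keyed by feature name.
    final Map<String, Object> primitives = new LinkedHashMap<>();
    // Non-primitive features, i.e. references to other feature structures.
    final Map<String, Fs> references = new LinkedHashMap<>();

    Fs(String type) {
        this.type = type;
    }
}

final class TwoPassCopy {
    // Assumed type change, for illustration only: "OldB" became "NewB".
    static String mapType(String oldType) {
        return oldType.equals("OldB") ? "NewB" : oldType;
    }

    // Copies everything reachable from the roots and returns the
    // old-to-new mapping (keyed by object identity).
    static Map<Fs, Fs> copyAndUpgrade(List<Fs> roots) {
        Map<Fs, Fs> oldToNew = new IdentityHashMap<>();
        Deque<Fs> todo = new ArrayDeque<>(roots);

        // Pass 1: create a copy of each reachable structure with its
        // primitive values, and remember which old FS maps to which new FS.
        while (!todo.isEmpty()) {
            Fs old = todo.pop();
            if (oldToNew.containsKey(old)) {
                continue; // already copied; this also terminates cycles
            }
            Fs copy = new Fs(mapType(old.type));
            copy.primitives.putAll(old.primitives);
            oldToNew.put(old, copy);
            for (Fs target : old.references.values()) {
                if (target != null) {
                    todo.push(target);
                }
            }
        }

        // Pass 2: connect references, pointing each one at the copy of
        // its original target rather than at the original itself.
        for (Map.Entry<Fs, Fs> entry : oldToNew.entrySet()) {
            Fs old = entry.getKey();
            Fs copy = entry.getValue();
            for (Map.Entry<String, Fs> ref : old.references.entrySet()) {
                copy.references.put(ref.getKey(), oldToNew.get(ref.getValue()));
            }
        }
        return oldToNew;
    }
}
```

Splitting the work into two passes is what lets cyclic references be
handled: by the time a reference is connected, the copy of its target is
guaranteed to exist in the mapping.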