uima-user mailing list archives

From Petr Baudis <pa...@ucw.cz>
Subject Copying a CAS subset with offset correction
Date Sun, 27 Apr 2014 22:10:30 GMT
  Hi!

  I'm trying to figure out how to reliably do deep copies from one CAS
to another where the sofa of the target CAS is a substring of the sofa
of the source CAS. E.g. if the source sofa were the previous sentence,
the target sofa could be just "do deep copies from one CAS to another".

  One approach is to simply do something like

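	// subCasSpan is the annotation covering the text that becomes the target sofa
	int ofs = subCasSpan.getBegin();
	CasCopier copier = new CasCopier(srcCas.getCas(), dstCas.getCas());
	for (Annotation a : JCasUtil.selectCovered(Annotation.class, subCasSpan)) {
		// deep copy; any feature structures referenced by a are copied as well
		Annotation a2 = (Annotation) copier.copyFs(a);
		// shift the offsets so they are relative to the target sofa
		a2.setBegin(a2.getBegin() - ofs);
		a2.setEnd(a2.getEnd() - ofs);
		a2.addToIndexes();
	}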

However, this breaks down when a feature structure references other
feature structures: CasCopier deep-copies those too, and if they lie
outside the span, their offsets never get corrected. These "hidden"
feature structures stay referenced but become nonsensical and
misleading; ideally they would not be copied at all and the references
to them would be set to null.

  I don't think this is something that's easily achievable right now?
(The possible annotation types are an open set, so manual per-annotation
handling of references is not feasible in my case.)

  I think the most reasonable solution would be to introduce a way to
specify an offset span for the CasCopier (or a subclass), with
annotations dropped if they are outside of the offset span?
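
  To make the intended behaviour concrete, here is a rough sketch in
plain Java. It does not use the UIMA API: Node and SpanCopier are
hypothetical stand-ins I made up for an annotation-like feature
structure and for a span-aware copier, and the model assumes every
referenced structure carries begin/end offsets (feature structures
without offsets would need a separate decision). It only illustrates
the rule "copy what is inside the span, shift its offsets once, and
replace references to anything outside by null".

```java
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for an annotation-like feature structure (not a UIMA class). */
class Node {
    final String type;
    final int begin;
    final int end;
    // Reference-valued features by feature name; a value may be null.
    final Map<String, Node> refs = new LinkedHashMap<>();

    Node(String type, int begin, int end) {
        this.type = type;
        this.begin = begin;
        this.end = end;
    }
}

/** Hypothetical span-aware deep copier (an assumption, not an existing CasCopier feature). */
class SpanCopier {
    private final int spanBegin;
    private final int spanEnd;
    // Identity-based cache so that each source node is copied at most once.
    private final Map<Node, Node> copied = new IdentityHashMap<>();

    SpanCopier(int spanBegin, int spanEnd) {
        this.spanBegin = spanBegin;
        this.spanEnd = spanEnd;
    }

    /** Returns an offset-corrected deep copy, or null if src is null or not fully inside the span. */
    Node copy(Node src) {
        if (src == null || src.begin < spanBegin || src.end > spanEnd) {
            return null; // outside the span: dropped, so referrers end up with null
        }
        Node done = copied.get(src);
        if (done != null) {
            return done; // already copied; offsets were shifted exactly once
        }
        Node dst = new Node(src.type, src.begin - spanBegin, src.end - spanBegin);
        copied.put(src, dst); // register before recursing so reference cycles terminate
        for (Map.Entry<String, Node> e : src.refs.entrySet()) {
            dst.refs.put(e.getKey(), copy(e.getValue()));
        }
        return dst;
    }
}
```

With a span of [40, 80), a node at 45-50 comes out at 5-10; a
reference from it to a node at 10-15 becomes null, while a reference to
a node at 52-60 points to a copy at 12-20.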

  Thanks,

				Petr "Pasky" Baudis
