uima-user mailing list archives

From Christoph Büscher <christoph.buesc...@neofonie.de>
Subject Problems with "deserializeCasFromXmi" after using C++ AS Annotator
Date Thu, 10 Dec 2009 13:53:54 GMT

I recently encountered a problem with the XMI deserialization of a feature 
structure after calling a remote C++ AS annotator from a CPE. The scenario is 
the following:

1. I add a custom feature structure "DocumentData" containing a String feature 
(the document URL) to the CAS in my CPE. The exact URL causing the problem is:


2. The CAS gets serialized to XMI before it is sent to a remote C++ TAE. I 
set a breakpoint in UimaSerializer.serializeCasToXmi() and see the following 
part in the XMI string:


So here the "&" character seems to be escaped correctly.

3. When the document comes back, the same feature in the XMI string received in 
UimaSerializer.deserializeCasFromXmi() reads:


and now the SAXParser throws the following exception:

org.xml.sax.SAXParseException: The reference to entity "_psmand" must end with 
the ';' delimiter.
	at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
	at ...

because the "&" comes back unescaped. I'm sure the C++ annotator in question 
doesn't change this feature, and it correctly adds its own annotations. I 
suspect something goes wrong when the CAS is deserialized from XMI and 
serialized back on the C++ side.
Do you have any idea what might cause this problem, or any suggestion where I 
can start to narrow it down further?
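
One way to narrow this down might be to check the XMI string on each side of the remote call for well-formedness before it reaches deserializeCasFromXmi(). The following is only a minimal sketch using the JDK's SAX parser. The class name XmiAmpersandCheck, the element and attribute names, and the example URL are hypothetical placeholders, not the actual data from this report.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class XmiAmpersandCheck {

    // Returns true if the string parses as well-formed XML, false if the
    // SAX parser rejects it (for example because of a bare "&").
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not what we are testing for.
            throw new RuntimeException(e);
        }
    }

    // Escapes the characters that must not appear literally in an XML
    // attribute value. "&" is replaced first so that the entities produced
    // by the later replacements are not escaped a second time.
    static String escapeAttribute(String value) {
        return value.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;")
                    .replace("\"", "&quot;")
                    .replace("'", "&apos;");
    }

    public static void main(String[] args) {
        // Hypothetical URL with a query string; not the URL from the report.
        String url = "http://example.org/page?id=1&lang=de";

        // Hypothetical element and attribute names, for illustration only.
        String raw = "<DocumentData url=\"" + url + "\"/>";
        String escaped = "<DocumentData url=\"" + escapeAttribute(url) + "\"/>";

        System.out.println("unescaped well-formed: " + isWellFormed(raw));
        System.out.println("escaped well-formed:   " + isWellFormed(escaped));
    }
}
```

Running a check like isWellFormed() on the string sent to the C++ service and again on the string received from it should show on which side the "&" loses its escaping.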

The remote C++ AE is running with version "uimacpp-2.2.2-incubating".

Christoph Büscher

Technologieentwicklung und
Informationsmanagement GmbH
Robert-Koch-Platz 4
10115 Berlin
fon: +49.30 24627 522
fax: +49.30 24627 120

Berlin-Charlottenburg: HRB 67460

Helmut Hoffer von Ankershoffen
(Sprecher der Geschaeftsfuehrung)
Nurhan Yildirim
