uima-user mailing list archives

From Eddie Epstein <eaepst...@gmail.com>
Subject Re: Problems with "deserializeCasFromXmi" after using C++ AS Annotator
Date Thu, 10 Dec 2009 21:47:22 GMT
Hi Christoph,

I could not reproduce the problem, with 2.2.2 or the latest 2.3.0
release candidate code.

My scenario was:
1. add the following feature to type David:
  	<featureDescription>
  	  <name>documentURL</name>
  	  <description></description>
  	  <rangeTypeName>uima.cas.String</rangeTypeName>
  	</featureDescription>

2. add this code to DaveDetector.cpp:
      AnnotationFS fsNewExp =
        tcas.createAnnotation(david, uiExprBeginPos, uiExprEndPos);
+      Feature documentURL  = david.getFeatureByBaseName("documentURL");
+      fsNewExp.setStringValue(documentURL,
"http://www.gesundheitsnachrichten.net/live/navigation/live.php?navigation_id=11&_psmand=1");
      indexRep.addFS(fsNewExp);

3. start DaveDetector as a service:
    deployCppService descriptors\DaveDetector.xml DaveDetector

4. create a DaveDetector JMS service descriptor pointing at the service:
<customResourceSpecifier xmlns="http://uima.apache.org/resourceSpecifier">
   <resourceClassName>org.apache.uima.aae.jms_adapter.JmsAnalysisEngineServiceAdapter</resourceClassName>
   <parameters>
     <parameter name="brokerURL" value="tcp://localhost:61616"/>
     <parameter name="endpoint" value="DaveDetector"/>
   </parameters>
</customResourceSpecifier>

5. use cvd to connect to the service, put a Dave and David in the
text, call it, and inspect the results.

Please do provide the modifications to DaveDetector and a detailed
description of the scenario.
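
In the meantime, it may help to check the returned XMI fragment outside of UIMA. The following is only a minimal sketch using the JDK's SAX parser, not the UimaSerializer code path; the class name XmiAmpersandCheck, the helper parseError, and the bare DocumentData element are made up for illustration. It parses the attribute once with "&amp;" and once with a raw "&", and the second form is expected to be rejected by any conforming XML parser.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class XmiAmpersandCheck {

    // Hypothetical helper: returns null if the XML is well-formed,
    // otherwise the message of the SAXParseException that was thrown.
    public static String parseError(String xml) {
        try {
            SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return e.getMessage();
        } catch (Exception e) {
            // Parser configuration or I/O problems are not what we are testing.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String url =
            "http://www.gesundheitsnachrichten.net/live/navigation/live.php?navigation_id=11";

        // Stand-in element; a real XMI document has more structure around it.
        String escaped = "<DocumentData documentURL=\"" + url + "&amp;_psmand=1\"/>";
        String raw = "<DocumentData documentURL=\"" + url + "&_psmand=1\"/>";

        System.out.println("escaped: " + parseError(escaped));
        System.out.println("raw:     " + parseError(raw));
    }
}
```

If the raw form fails here in the same way, the exception on the Java side is only the symptom, and the place to look is wherever the XMI string is written out before it is sent back.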

Regards,
Eddie

On Thu, Dec 10, 2009 at 10:43 AM, Christoph Büscher
<christoph.buescher@neofonie.de> wrote:
> Hi again,
>
> I was able to reproduce the problem also with the DaveDetector now. I wrote
> a short unit test that I can provide upon request to demonstrate the
> problem.
>
> Christoph
>
> Christoph Büscher wrote:
>>
>> Hi,
>>
>> I recently encountered a problem with the XMI deserialization of a
>> feature structure after calling a remote C++ AS annotator from a CPE. The
>> scenario is the following:
>>
>> 1. I add a custom feature structure "DocumentData" containing a String
>> feature (the document URL) to the CAS in my CPE. The exact URL causing the
>> problem is:
>>
>>
>> documentURL="http://www.gesundheitsnachrichten.net/live/navigation/live.php?navigation_id=11&_psmand=1"
>>
>> 2. The CAS gets serialized to XMI before it is sent to a remote C++ TAE.
>> I added a breakpoint to UimaSerializer.serializeCasToXmi() and get the
>> following part in the XMI string:
>>
>>
>> documentURL="http://www.gesundheitsnachrichten.net/live/navigation/live.php?navigation_id=11&amp;_psmand=1"
>>
>> So here the "&" character seems to be escaped correctly.
>>
>> 3. When the document comes back, the same feature in the XMI string
>> received in UimaSerializer.deserializeCasFromXmi() reads:
>>
>>
>> documentURL="http://www.gesundheitsnachrichten.net/live/navigation/live.php?navigation_id=11&_psmand=1"
>>
>> and now the SAXParser throws the following exception:
>>
>> org.xml.sax.SAXParseException: The reference to entity "_psmand" must end
>> with the ';' delimiter.
>>    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
>>    at org.apache.uima.aae.UimaSerializer.deserializeCasFromXmi(UimaSerializer.java:170)
>>    at ...
>>
>> because the "&" comes back unescaped. I'm sure the C++ annotator
>> doesn't change this feature, and it correctly adds its own
>> annotations. I suspect something goes wrong when the CAS is
>> deserialized from XMI and serialized back on the C++ side.
>> Do you have any idea what might cause this problem or any suggestion where
>> I can start to further narrow down the problem?
>>
>> The remote C++ AE is running with version "uimacpp-2.2.2-incubating".
>>
>>
>>
>
>
> --
> --------------------------------
> Christoph Büscher
> Softwareentwicklung
>
> neofonie
> Technologieentwicklung und
> Informationsmanagement GmbH
> Robert-Koch-Platz 4
> 10115 Berlin
> fon: +49.30 24627 522
> fax: +49.30 24627 120
> http://www.neofonie.de
>
> Handelsregister
> Berlin-Charlottenburg: HRB 67460
>
> Geschäftsführung
> Helmut Hoffer von Ankershoffen
> (Sprecher der Geschaeftsfuehrung)
> Nurhan Yildirim
>
