incubator-ctakes-dev mailing list archives

From "Chen, Pei" <Pei.C...@childrens.harvard.edu>
Subject UIMA AS Binary vs XMI serialization
Date Thu, 03 Jan 2013 21:25:34 GMT
Hi,
I was just curious about others' experience with binary serialization.

My original issue was documents that contained invalid XML characters, so I decided to try the
binary serialization option within AS instead of replacing or modifying the special characters in the
original docs. As a side effect, I noticed that it is orders of magnitude faster.
Is there any reason not to make this the recommended/default option when sending
CASes around within AS? Are there any downsides to be aware of (assuming that UIMA will have
wrappers to abstract this from users for all of their implementations)?

Caused by: org.xml.sax.SAXParseException; Trying to serialize non-XML 1.0 character: , 0x0
        at org.apache.uima.util.XMLSerializer$CharacterValidatingContentHandler.checkForInvalidXmlChars(XMLSerializer.java:254)
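
For anyone who wants to find the offending characters before choosing a serialization, here is a minimal sketch in plain Java. It assumes the check follows the XML 1.0 Char production (tab, LF, CR, 0x20-0xD7FF, 0xE000-0xFFFD, 0x10000-0x10FFFF). The class and method names are hypothetical and are not part of UIMA; this is not the code inside XMLSerializer.

```java
public final class XmlCharCheck {
    private XmlCharCheck() {}

    /** True if the code point is allowed by the XML 1.0 Char production. */
    public static boolean isValidXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Index (in chars) of the first invalid code point, or -1 if there is none. */
    public static int firstInvalidIndex(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isValidXml10(cp)) {
                return i;
            }
            // Step over both halves of a surrogate pair when present.
            i += Character.charCount(cp);
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical document text containing a NUL, like the 0x0 in the trace above.
        String doc = "Patient note\u0000 with a NUL";
        System.out.println(firstInvalidIndex(doc)); // prints 12
    }
}
```

A result other than -1 means the text would likely hit the same kind of validation error when written as XML 1.0, so it can be used to flag documents up front.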

--Pei
