uima-user mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: No sofaFS for specified sofaRef found
Date Fri, 13 May 2016 13:33:21 GMT
Thanks for your report, analysis, and the data; very thorough and helpful.

I was trying to imagine how the input XMI file could have become "corrupted" in
this manner.  I'm guessing that what may have happened is that something outside
of the XMI file changed - for instance, the type system.

This situation (of the sofa attribute being missing) could happen if the type
definition for DocumentMetadata changed.

I searched the source code for UIMA for "DocumentMetadata" as a UIMA type, and
came up empty, so I'm guessing this is some type that was defined in your
particular application.  I see in your note that you included a description of
the type system defining DocumentMetadata, and it shows that the supertype
hierarchy of this type is:

  uima.tcas.DocumentAnnotation, whose supertype is
  uima.tcas.Annotation, whose supertype is
  uima.cas.AnnotationBase, whose supertype is
  uima.cas.TOP

I'm guessing that at the time the XMI serialized form was created, a different
type system was being used that defined the supertype hierarchy for
DocumentMetadata such that it did **not** have uima.cas.AnnotationBase in its
supertype hierarchy.  This would mean that it did not have the "sofa" feature.
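
One way to test this guess against an older copy of the type system descriptor
is to walk the supertype chain and see whether it reaches
uima.cas.AnnotationBase.  The following is only a rough sketch using plain JDK
XML parsing, not the UIMA API: the class SofaHierarchyCheck and its method
names are made up for illustration, and the built-in part of the chain is
hard-coded from the hierarchy listed above.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper (not a UIMA class): inspects a type system descriptor. */
final class SofaHierarchyCheck {

    /** Maps each type name to its supertype name, seeded with the built-in chain. */
    static Map<String, String> readSupertypes(String descriptorXml) {
        Map<String, String> supertypes = new HashMap<>();
        // Assumption: built-in hierarchy as quoted in this thread.
        supertypes.put("uima.tcas.DocumentAnnotation", "uima.tcas.Annotation");
        supertypes.put("uima.tcas.Annotation", "uima.cas.AnnotationBase");
        supertypes.put("uima.cas.AnnotationBase", "uima.cas.TOP");
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(descriptorXml)));
            // "*" matches typeDescription elements in any namespace.
            NodeList types = doc.getElementsByTagNameNS("*", "typeDescription");
            for (int i = 0; i < types.getLength(); i++) {
                Element type = (Element) types.item(i);
                String name = childText(type, "name");
                String superName = childText(type, "supertypeName");
                if (name != null && superName != null) {
                    supertypes.put(name, superName);
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse type system descriptor", e);
        }
        return supertypes;
    }

    /** Text of the first direct child element with the given local name, or null. */
    private static String childText(Element parent, String localName) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && localName.equals(n.getLocalName())) {
                return n.getTextContent().trim();
            }
        }
        return null;
    }

    /** True if typeName is, or inherits from, uima.cas.AnnotationBase. */
    static boolean inheritsSofaFeature(Map<String, String> supertypes, String typeName) {
        Set<String> seen = new HashSet<>(); // guards against cyclic definitions
        for (String t = typeName; t != null && seen.add(t); t = supertypes.get(t)) {
            if ("uima.cas.AnnotationBase".equals(t)) {
                return true;
            }
        }
        return false;
    }
}
```

If inheritsSofaFeature(...) returns false for com.example.DocumentMetadata with
the old descriptor and true with the current one, that would be consistent with
the explanation above.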

Can you perhaps confirm that this (changing the type system in this manner) was
probably the cause of this?

-----------------

In figuring out what's the best thing to do, it seems there are multiple
conditions to maybe try and catch.  This failure requires all of the following:

  1) a type was deserialized where the type system had some features that were
missing in the serialization

  2) the particular feature "sofa" was missing

  3) the feature structure with the missing sofa feature was in a list in the
serialization specifying it was to be added to the indexes
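
Conditions (2) and (3) can be spotted in an XMI file before handing it to the
deserializer.  Here is a minimal sketch, assuming the XMI layout shown in the
report (top-level elements carrying xmi:id, plus cas:View elements with a
"members" list).  It uses only JDK XML parsing; MissingSofaScan is a
hypothetical name, not something in UIMA.  It cannot tell whether a type is
supposed to have a sofa feature, so anything it reports is only a candidate.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical pre-check (not a UIMA class) for indexed elements lacking "sofa". */
final class MissingSofaScan {
    private static final String CAS_NS = "http:///uima/cas.ecore";
    private static final String XMI_NS = "http://www.omg.org/XMI";

    /** Returns "tagName xmi:id=N" for each indexed element without a sofa attribute. */
    static List<String> indexedWithoutSofa(String xmi) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xmi)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse XMI", e);
        }
        Set<String> indexedIds = new HashSet<>();
        List<Element> withoutSofa = new ArrayList<>();
        Element root = doc.getDocumentElement();
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (!(n instanceof Element)) {
                continue;
            }
            Element e = (Element) n;
            if (CAS_NS.equals(e.getNamespaceURI()) && "View".equals(e.getLocalName())) {
                // Condition (3): ids listed here are added to the indexes.
                for (String id : e.getAttribute("members").trim().split("\\s+")) {
                    if (!id.isEmpty()) {
                        indexedIds.add(id);
                    }
                }
            } else if (!e.hasAttribute("sofa")) {
                // Condition (2): the sofa attribute is absent.
                withoutSofa.add(e);
            }
        }
        List<String> findings = new ArrayList<>();
        for (Element e : withoutSofa) {
            String id = e.getAttributeNS(XMI_NS, "id");
            if (indexedIds.contains(id)) {
                findings.add(e.getTagName() + " xmi:id=" + id);
            }
        }
        return findings;
    }
}
```

Run against the two files in the report, this should flag the
ls:DocumentMetadata element in CAS2.xmi and nothing in CAS1.xmi.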

(1) is not generally an error; it is allowed to permit evolution of type systems
(in a compatible way) over time. For example, new features could be added.  Any
features not in the serialization are set to their default values.

(2) is not necessarily an error either (but it is bad practice).  For
example, you might be using the initial view, and might have never created a
Sofa.  (Sofas are always created if you create a view programmatically.)

Later versions of UIMA check for a missing sofa feature when attempting to add
a Feature Structure to the indexes; this test was not always there, and, for
backwards compatibility, it can be disabled with a
-Duima.disable_enhanced_check_wrong_add_to_index JVM property.

----------------

Because of this, I'm planning to leave the detection of this alone, but will
change the error message to indicate some potential causes of this situation.
One of them is a change in the type system definition for this type: the type
system in use when the serialized form was created did not have
uima.cas.AnnotationBase in the hierarchy, while the current type system being
used for deserialization does.  I'll include that cause especially if you can
confirm it was the likely one here.

Thank you very much for your report and analysis!

-Marshall



On 5/6/2016 6:30 PM, Pablo N. Mendes wrote:
> Folks,
> I am getting "No sofaFS for specified sofaRef found" while trying to
> deserialize an XMI. I found the message a bit cryptic and didn't find much
> help on the lazyweb, so I bit the bullet and spent a few hours poking
> around. It seems to be a missing "sofa" attribute. If the sofa attribute
> has the wrong value, then you get "xmi id <id> is referenced but not
> defined" which is very nice and clear. But if you omit the sofa attribute
> you get "No sofaFS for specified sofaRef found" which is less informative
> IMHO.
>
> Extra info below.
>
> Cheers,
> Pablo
>
> $ diff cas1.xmi cas2.xmi
> 9c9
> < <ls:DocumentMetadata xmi:id="18" sofa="1" source="file001.txt"
> documentId="001"/>
> ---
>> <ls:DocumentMetadata xmi:id="18" source="file001.txt" documentId="001"/>
>
>
>
> VERSIONS
>
>     <uima.version>2.8.1</uima.version>
>     <uimafit.version>2.1.0</uimafit.version>
>
> JAVA CODE SNIPPET
>
> org.apache.uima.util.XmlCasDeserializer.deserialize(inputStream,
> jCas.getCas());
>
> STACK TRACE
>
> Exception in thread "main" org.apache.uima.cas.CASRuntimeException: No
> sofaFS for specified sofaRef found.
> at org.apache.uima.cas.impl.CASImpl.getSofa(CASImpl.java:806)
> at
> org.apache.uima.cas.impl.FSIndexRepositoryImpl.ll_addFS_common(FSIndexRepositoryImpl.java:2781)
> at
> org.apache.uima.cas.impl.FSIndexRepositoryImpl.ll_addFS(FSIndexRepositoryImpl.java:2763)
> at
> org.apache.uima.cas.impl.FSIndexRepositoryImpl.addFS(FSIndexRepositoryImpl.java:2068)
> at
> org.apache.uima.cas.impl.XmiCasDeserializer$XmiCasDeserializerHandler.endDocument(XmiCasDeserializer.java:1486)
> at
> org.apache.uima.util.XmlCasDeserializer$XmlCasDeserializerHandler.endDocument(XmlCasDeserializer.java:127)
> at org.apache.xerces.parsers.AbstractSAXParser.endDocument(Unknown Source)
> at org.apache.xerces.impl.XMLDocumentScannerImpl.endEntity(Unknown Source)
> at org.apache.xerces.impl.XMLEntityManager.endEntity(Unknown Source)
> at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
> at org.apache.xerces.impl.XMLEntityScanner.skipSpaces(Unknown Source)
> at
> org.apache.xerces.impl.XMLDocumentScannerImpl$TrailingMiscDispatcher.dispatch(Unknown
> Source)
> at
> org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown
> Source)
> at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
> at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
> at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
> at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
> at
> org.apache.uima.util.XmlCasDeserializer.deserialize(XmlCasDeserializer.java:83)
> at
> org.apache.uima.util.XmlCasDeserializer.deserialize(XmlCasDeserializer.java:58)
> ...
>
>
> CAS1.xmi
>
> <?xml version="1.0" encoding="UTF-8"?>
> <xmi:XMI
> xmlns:cas="http:///uima/cas.ecore"
> xmlns:tcas="http:///uima/tcas.ecore"
> xmlns:xmi="http://www.omg.org/XMI"
> xmlns:ls="http:///com/example.ecore"
> xmi:version="2.0">
> <cas:NULL xmi:id="0"/>
> <ls:DocumentMetadata xmi:id="18" sofa="1" source="file001.txt"
> documentId="001"/>
> <cas:Sofa xmi:id="1" sofaNum="1" sofaID="_InitialView" mimeType="text"
> sofaString="This is a test."/>
> <cas:View sofa="1" members="18"/>
> </xmi:XMI>
>
>
> CAS2.xmi
>
> <?xml version="1.0" encoding="UTF-8"?>
> <xmi:XMI
> xmlns:cas="http:///uima/cas.ecore"
> xmlns:tcas="http:///uima/tcas.ecore"
> xmlns:xmi="http://www.omg.org/XMI"
> xmlns:ls="http:///com/example.ecore"
> xmi:version="2.0">
> <cas:NULL xmi:id="0"/>
> <ls:DocumentMetadata xmi:id="18" source="file001.txt" documentId="001"/>
> <cas:Sofa xmi:id="1" sofaNum="1" sofaID="_InitialView" mimeType="text"
> sofaString="This is a test."/>
> <cas:View sofa="1" members="18"/>
> </xmi:XMI>
>
>
> TYPESYSTEM
>
> <?xml version="1.0" encoding="UTF-8" ?>
>
> <typeSystemDescription  xmlns="http://uima.apache.org/resourceSpecifier">
>         <name>ExampleTypeSystem</name>
>         <description>Just an example</description>
>         <vendor>example.com</vendor>
>         <version>1.0</version>
>         <types>
>                 <typeDescription>
>                         <name>com.example.DocumentMetadata</name>
>                         <description></description>
>
> <supertypeName>uima.tcas.DocumentAnnotation</supertypeName>
>                         <features>
>                                 <featureDescription>
>                                         <name>source</name>
>                                         <description>Source</description>
>
> <rangeTypeName>uima.cas.String</rangeTypeName>
>                                 </featureDescription>
>                                 <featureDescription>
>                                         <name>documentId</name>
>                                         <description>Source</description>
>
> <rangeTypeName>uima.cas.String</rangeTypeName>
>                                 </featureDescription>
>                         </features>
>                 </typeDescription>
>
>         </types>
> </typeSystemDescription>
>
>

