uima-user mailing list archives

From "Pablo N. Mendes" <pablomen...@gmail.com>
Subject Re: No sofaFS for specified sofaRef found
Date Fri, 13 May 2016 17:27:19 GMT
Bingo! I created the posted type system by reverse engineering it from the
XMI provided to me by a third party. I have since requested the type system
XML from them and I can now see that the original DocumentMetadata inherits
directly from uima.cas.TOP! So, yes, the annotation did *not* inherit from
AnnotationBase when it was serialized to XMI but I asked uimaj to create
an annotation that *does* inherit from AnnotationBase when deserializing it
back into a Java object.
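
For anyone who hits the same message, here is a minimal sketch of a pre-check that can be run on an XMI file before handing it to the deserializer. It uses only the JDK XML parser, not the UIMA API. The class name XmiSofaCheck and the method findIndexedWithoutSofa are made up for illustration, and the namespace URIs are taken from the sample files quoted below. It lists the xmi:id of every element that a cas:View indexes but that carries no "sofa" attribute. Such an element is only a problem if the type system used for deserialization puts its type under uima.cas.AnnotationBase, so treat the output as a list of candidates to compare against the type system, not as a list of errors.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of UIMA: scans an XMI document with the JDK DOM parser.
final class XmiSofaCheck {
    // Namespace URIs as they appear in the sample XMI files in this thread.
    private static final String XMI_NS = "http://www.omg.org/XMI";
    private static final String CAS_NS = "http:///uima/cas.ecore";

    /** Returns the xmi:id of each element listed in a cas:View "members" attribute that has no "sofa" attribute. */
    static List<String> findIndexedWithoutSofa(String xmi) {
        Element root;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xmi)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XMI", e);
        }

        Map<String, Element> elementsById = new HashMap<>();
        Set<String> indexedIds = new LinkedHashSet<>();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() != Node.ELEMENT_NODE) {
                continue;
            }
            Element element = (Element) node;
            if (CAS_NS.equals(element.getNamespaceURI()) && "View".equals(element.getLocalName())) {
                // "members" is a whitespace-separated list of xmi:id values added to this view's indexes.
                for (String id : element.getAttribute("members").trim().split("\\s+")) {
                    if (!id.isEmpty()) {
                        indexedIds.add(id);
                    }
                }
            } else {
                String id = element.getAttributeNS(XMI_NS, "id");
                if (!id.isEmpty()) {
                    elementsById.put(id, element);
                }
            }
        }

        List<String> missing = new ArrayList<>();
        for (String id : indexedIds) {
            Element element = elementsById.get(id);
            if (element != null && !element.hasAttribute("sofa")) {
                missing.add(id);
            }
        }
        return missing;
    }

    // Usage: java XmiSofaCheck cas1.xmi cas2.xmi (files are assumed to be UTF-8)
    public static void main(String[] args) throws Exception {
        for (String path : args) {
            String xmi = new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
            System.out.println(path + ": " + findIndexedWithoutSofa(xmi));
        }
    }
}
```

Run against the two sample files quoted below, this should report nothing for the one where DocumentMetadata has sofa="1" and report id 18 for the one where the attribute is absent.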

Thank you for your time and for the detailed answer. I now see more clearly
the complexity of the issue. Great work!

Cheers,
Pablo

On Fri, May 13, 2016 at 6:33 AM, Marshall Schor <msa@schor.com> wrote:

> Thanks for your report, analysis, and the data; very thorough and helpful.
>
> I was trying to imagine how the input XMI file could have become
> "corrupted" in this manner.  I'm guessing that what may have happened is
> that something outside of the XMI file changed - for instance, the type
> system.
>
> This situation (of the sofa attribute being missing) could happen if the
> type definition for DocumentMetadata changed.
>
> I searched the source code for UIMA for "DocumentMetadata" as a UIMA type,
> and came up empty, so I'm guessing this is some type that was defined in
> your particular application.  I see in your note that you included a
> description of the type system defining DocumentMetadata, and it shows that
> the supertype hierarchy of this type is:
>
>   uima.tcas.DocumentAnnotation, whose supertype is
>   uima.tcas.Annotation, whose supertype is
>   uima.cas.AnnotationBase, whose supertype is
>   uima.cas.TOP
>
> I'm guessing that at the time the XMI serialized form was created, a
> different type system was being used that defined the supertype hierarchy
> for DocumentMetadata such that it did **not** have uima.cas.AnnotationBase
> in its supertype hierarchy.  This would mean that it did not have the
> "sofa" feature.
>
> Can you perhaps confirm that this (changing the type system in this
> manner) was probably the cause of this?
>
> -----------------
>
> In figuring out what's the best thing to do, it seems there are multiple
> conditions to maybe try and catch.  To cause this failure:
>
>   1) a type was deserialized where the type system had some features that
>      were missing in the serialization
>
>   2) the particular feature "sofa" was missing
>
>   3) the feature structure with the missing sofa feature was in a list in
>      the serialization specifying it was to be added to the indexes
>
> (1) is not generally an error; it is allowed to permit evolution of type
> systems (in a compatible way) over time. For example, new features could be
> added.  Any features not in the serialization are set to their default
> values.
>
> (2) even this is not (necessarily) an error (but it is bad practice).  For
> example, you might be using the initial view, and might have never created
> a Sofa.  (Sofas are always created if you create a view programmatically).
>
> Later versions of UIMA check for the sofa feature missing when attempting
> to add a Feature Structure to the indexes; this test was not always there,
> and, for backwards compatibility, it can be disabled with a
> -Duima.disable_enhanced_check_wrong_add_to_index JVM property.
>
> ----------------
>
> Because of this, I'm planning to leave the detection of this alone, but
> will change the error message to indicate some potential causes of this
> situation, including that the type system definition changed for this type
> from one not having uima.cas.AnnotationBase in the hierarchy when the
> serialized form was created, to the current type system being used for
> deserialization, which does (especially, if you can confirm this was the
> likely cause).
>
> Thank you very much for your report and analysis!
>
> -Marshall
>
>
>
> On 5/6/2016 6:30 PM, Pablo N. Mendes wrote:
> > Folks,
> > I am getting "No sofaFS for specified sofaRef found" while trying to
> > deserialize an XMI. I found the message a bit cryptic and didn't find
> > much help on the lazyweb, so I bit the bullet and spent a few hours poking
> > around. It seems to be a missing "sofa" attribute. If the sofa attribute
> > has the wrong value, then you get "xmi id <id> is referenced but not
> > defined" which is very nice and clear. But if you omit the sofa attribute
> > you get "No sofaFS for specified sofaRef found" which is less informative
> > IMHO.
> >
> > Extra info below.
> >
> > Cheers,
> > Pablo
> >
> > $ diff cas1.xmi cas2.xmi
> > 9c9
> > < <ls:DocumentMetadata xmi:id="18" sofa="1" source="file001.txt" documentId="001"/>
> > ---
> > > <ls:DocumentMetadata xmi:id="18" source="file001.txt" documentId="001"/>
> >
> >
> >
> > VERSIONS
> >
> >     <uima.version>2.8.1</uima.version>
> >     <uimafit.version>2.1.0</uimafit.version>
> >
> > JAVA CODE SNIPPET
> >
> > org.apache.uima.util.XmlCasDeserializer.deserialize(inputStream,
> > jCas.getCas());
> >
> > STACK TRACE
> >
> > Exception in thread "main" org.apache.uima.cas.CASRuntimeException: No sofaFS for specified sofaRef found.
> > at org.apache.uima.cas.impl.CASImpl.getSofa(CASImpl.java:806)
> > at org.apache.uima.cas.impl.FSIndexRepositoryImpl.ll_addFS_common(FSIndexRepositoryImpl.java:2781)
> > at org.apache.uima.cas.impl.FSIndexRepositoryImpl.ll_addFS(FSIndexRepositoryImpl.java:2763)
> > at org.apache.uima.cas.impl.FSIndexRepositoryImpl.addFS(FSIndexRepositoryImpl.java:2068)
> > at org.apache.uima.cas.impl.XmiCasDeserializer$XmiCasDeserializerHandler.endDocument(XmiCasDeserializer.java:1486)
> > at org.apache.uima.util.XmlCasDeserializer$XmlCasDeserializerHandler.endDocument(XmlCasDeserializer.java:127)
> > at org.apache.xerces.parsers.AbstractSAXParser.endDocument(Unknown Source)
> > at org.apache.xerces.impl.XMLDocumentScannerImpl.endEntity(Unknown Source)
> > at org.apache.xerces.impl.XMLEntityManager.endEntity(Unknown Source)
> > at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
> > at org.apache.xerces.impl.XMLEntityScanner.skipSpaces(Unknown Source)
> > at org.apache.xerces.impl.XMLDocumentScannerImpl$TrailingMiscDispatcher.dispatch(Unknown Source)
> > at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
> > at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
> > at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
> > at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
> > at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
> > at org.apache.uima.util.XmlCasDeserializer.deserialize(XmlCasDeserializer.java:83)
> > at org.apache.uima.util.XmlCasDeserializer.deserialize(XmlCasDeserializer.java:58)
> > ...
> >
> >
> > CAS1.xmi
> >
> > <?xml version="1.0" encoding="UTF-8"?>
> > <xmi:XMI
> > xmlns:cas="http:///uima/cas.ecore"
> > xmlns:tcas="http:///uima/tcas.ecore"
> > xmlns:xmi="http://www.omg.org/XMI"
> > xmlns:ls="http:///com/example.ecore"
> > xmi:version="2.0">
> > <cas:NULL xmi:id="0"/>
> > <ls:DocumentMetadata xmi:id="18" sofa="1" source="file001.txt"
> > documentId="001"/>
> > <cas:Sofa xmi:id="1" sofaNum="1" sofaID="_InitialView" mimeType="text"
> > sofaString="This is a test."/>
> > <cas:View sofa="1" members="18"/>
> > </xmi:XMI>
> >
> >
> > CAS2.xmi
> >
> > <?xml version="1.0" encoding="UTF-8"?>
> > <xmi:XMI
> > xmlns:cas="http:///uima/cas.ecore"
> > xmlns:tcas="http:///uima/tcas.ecore"
> > xmlns:xmi="http://www.omg.org/XMI"
> > xmlns:ls="http:///com/example.ecore"
> > xmi:version="2.0">
> > <cas:NULL xmi:id="0"/>
> > <ls:DocumentMetadata xmi:id="18" source="file001.txt" documentId="001"/>
> > <cas:Sofa xmi:id="1" sofaNum="1" sofaID="_InitialView" mimeType="text"
> > sofaString="This is a test."/>
> > <cas:View sofa="1" members="18"/>
> > </xmi:XMI>
> >
> >
> > TYPESYSTEM
> >
> > <?xml version="1.0" encoding="UTF-8" ?>
> >
> > <typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
> >   <name>ExampleTypeSystem</name>
> >   <description>Just an example</description>
> >   <vendor>example.com</vendor>
> >   <version>1.0</version>
> >   <types>
> >     <typeDescription>
> >       <name>com.example.DocumentMetadata</name>
> >       <description></description>
> >       <supertypeName>uima.tcas.DocumentAnnotation</supertypeName>
> >       <features>
> >         <featureDescription>
> >           <name>source</name>
> >           <description>Source</description>
> >           <rangeTypeName>uima.cas.String</rangeTypeName>
> >         </featureDescription>
> >         <featureDescription>
> >           <name>documentId</name>
> >           <description>Source</description>
> >           <rangeTypeName>uima.cas.String</rangeTypeName>
> >         </featureDescription>
> >       </features>
> >     </typeDescription>
> >   </types>
> > </typeSystemDescription>
> >
> >
>
>


-- 

Pablo N. Mendes
http://pablomendes.com
