uima-user mailing list archives

From Thilo Goetz <twgo...@gmx.de>
Subject Re: Status Tika Annotator
Date Wed, 17 Feb 2010 10:21:16 GMT
The simplest explanation for what you're seeing would
be that the type system is somehow not loaded.  The
annotator requests a type object from the type
system, and doesn't check that it's not null.  The
type object is then handed to createFS(), which
barfs on a null type.  To verify, put a breakpoint
on line 165 of MarkupHandler.  attributeType must
not be null.  If it's null, that probably means the
type system is not loaded.  Did you change the
descriptors?  Any error messages on startup, in the
UIMA log?
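
The missing check described above can be sketched roughly as follows. This is only an illustration with stand-in classes, not the Tika annotator's code and not the UIMA API: MiniTypeSystem, requireType and the "example.*" type names are all hypothetical. In UIMA itself the lookup would go through the type system's getType(), which returns null for a name it does not know, before the result is handed to createFS().

```java
import java.util.HashMap;
import java.util.Map;

public class TypeLookupGuard {

    // Hypothetical stand-in for a loaded type system: maps type names to type objects.
    static final class MiniTypeSystem {
        private final Map<String, Object> types = new HashMap<>();

        void define(String name) {
            types.put(name, new Object());
        }

        // Returns null when the name is unknown, like a lookup against
        // a type system that was never loaded.
        Object getType(String name) {
            return types.get(name);
        }
    }

    // Fail early with a descriptive message instead of passing null downstream.
    static Object requireType(MiniTypeSystem ts, String name) {
        Object type = ts.getType(name);
        if (type == null) {
            throw new IllegalStateException(
                "Type not found: " + name + " (is the type system descriptor loaded?)");
        }
        return type;
    }

    public static void main(String[] args) {
        MiniTypeSystem ts = new MiniTypeSystem();
        ts.define("example.Known"); // hypothetical type name

        // Known type: lookup succeeds.
        System.out.println(requireType(ts, "example.Known") != null);

        // Unknown type: the guard reports the missing type by name.
        try {
            requireType(ts, "example.Missing");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a guard like this, the failure names the missing type at the point of lookup rather than showing up later as a bare NullPointerException.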

(That code is using a mixture of JCas and
plain old CAS, which I don't quite understand.  That
should have no bearing on your issue, however.)

--Thilo

On 2/16/2010 19:39, Roland Cornelissen wrote:
> Hi,
> 
>> version of the code ,
> I use 
> UIMA 2.3.0 
> Tika Annotator from the UIMA-Annotator-addons package (2.3.0)
> Tika 0.6
>> and send us the full stack trace.  
> java.lang.NullPointerException
> 	at org.apache.uima.cas.impl.CASImpl.createFS(CASImpl.java:474)
> 	at org.apache.uima.tika.MarkupHandler.populateCAS(MarkupHandler.java:165)
> 	at org.apache.uima.tika.TIKAWrapper.populateCASfromURL(TIKAWrapper.java:105)
> 	at org.apache.uima.tika.FileSystemCollectionReader.getNext(FileSystemCollectionReader.java:99)
> 	at org.apache.uima.collection.impl.cpm.engine.ArtifactProducer.readNext(ArtifactProducer.java:494)
> 	at org.apache.uima.collection.impl.cpm.engine.ArtifactProducer.run(ArtifactProducer.java:711)
> java.lang.NullPointerException
> 	at org.apache.uima.cas.impl.CASImpl.createFS(CASImpl.java:474)
> 	at org.apache.uima.tika.MarkupHandler.populateCAS(MarkupHandler.java:165)
> 	at org.apache.uima.tika.TIKAWrapper.populateCASfromURL(TIKAWrapper.java:105)
> 	at org.apache.uima.tika.FileSystemCollectionReader.getNext(FileSystemCollectionReader.java:99)
> 	at org.apache.uima.collection.impl.cpm.engine.ArtifactProducer.readNext(ArtifactProducer.java:494)
> 	at org.apache.uima.collection.impl.cpm.engine.ArtifactProducer.run(ArtifactProducer.java:711)
> uima.tcas.DocumentAnnotation   [Dolphin] AdditionalInfo=31 SortOrder=1 
> Timestamp=2010,2,16,13,42,48 ViewMode=1   
> org.apache.uima.tika.MarkupAnnotation   [Dolphin] AdditionalInfo=31 
> SortOrder=1 Timestamp=2010,2,16,13,42,48 ViewMode=1  
> org.apache.uima.tika.MarkupAnnotation  
> org.apache.uima.tika.SourceDocumentAnnotation 
> org.apache.uima.tika.MarkupAnnotation 
> org.apache.uima.tika.MarkupAnnotation [Dolphin] AdditionalInfo=31 
> SortOrder=1 Timestamp=2010,2,16,13,42,48 ViewMode=1 
> 
> 
> I have a simple test setup where output is written to an annotation writer: in 
> this case Tika reads 3 HTML pages, errors on the first 2 and passes the 
> annotations from the last (?). The last lines of the stack trace are the printed 
> annotations.
> 
> I hope this is better info.
> 
> Roland
> 
> 
