Message-ID: <4B7BC31C.2030607@gmx.de>
Date: Wed, 17 Feb 2010 11:21:16 +0100
From: Thilo Goetz
To: uima-user@incubator.apache.org
Subject: Re: Status Tika Annotator
References: <4B7ACA2B.8070708@gmx.de>

The simplest explanation for what you're seeing would be that the type
system is somehow not loaded. The annotator requests a type object from
the type system and doesn't check that it's not null. The type object is
then handed to createFS(), which barfs on a null type.

To verify, put a breakpoint on line 165 of MarkupHandler. attributeType
must not be null. If it's null, that probably means the type system is
not loaded. Did you change the descriptors? Any error messages on
startup, in the UIMA log?

(That code is using a mixture of JCas and plain old CAS, which I don't
quite understand. That should have no bearing on your issue, however.)

--Thilo

On 2/16/2010 19:39, Roland Cornelissen wrote:
> Hi,
>
>> version of the code,
> I use
> UIMA 2.30
> Tika Annotator from the UIMA-Annotator-addons package (2.3.0)
> Tika 0.6
>> and send us the full stack trace.
> java.lang.NullPointerException
>     at org.apache.uima.cas.impl.CASImpl.createFS(CASImpl.java:474)
>     at org.apache.uima.tika.MarkupHandler.populateCAS(MarkupHandler.java:165)
>     at org.apache.uima.tika.TIKAWrapper.populateCASfromURL(TIKAWrapper.java:105)
>     at org.apache.uima.tika.FileSystemCollectionReader.getNext(FileSystemCollectionReader.java:99)
>     at org.apache.uima.collection.impl.cpm.engine.ArtifactProducer.readNext(ArtifactProducer.java:494)
>     at org.apache.uima.collection.impl.cpm.engine.ArtifactProducer.run(ArtifactProducer.java:711)
> java.lang.NullPointerException
>     at org.apache.uima.cas.impl.CASImpl.createFS(CASImpl.java:474)
>     at org.apache.uima.tika.MarkupHandler.populateCAS(MarkupHandler.java:165)
>     at org.apache.uima.tika.TIKAWrapper.populateCASfromURL(TIKAWrapper.java:105)
>     at org.apache.uima.tika.FileSystemCollectionReader.getNext(FileSystemCollectionReader.java:99)
>     at org.apache.uima.collection.impl.cpm.engine.ArtifactProducer.readNext(ArtifactProducer.java:494)
>     at org.apache.uima.collection.impl.cpm.engine.ArtifactProducer.run(ArtifactProducer.java:711)
> uima.tcas.DocumentAnnotation [Dolphin] AdditionalInfo=31 SortOrder=1 Timestamp=2010,2,16,13,42,48 ViewMode=1
> org.apache.uima.tika.MarkupAnnotation [Dolphin] AdditionalInfo=31 SortOrder=1 Timestamp=2010,2,16,13,42,48 ViewMode=1
> org.apache.uima.tika.MarkupAnnotation
> org.apache.uima.tika.SourceDocumentAnnotation
> org.apache.uima.tika.MarkupAnnotation
> org.apache.uima.tika.MarkupAnnotation [Dolphin] AdditionalInfo=31 SortOrder=1 Timestamp=2010,2,16,13,42,48 ViewMode=1
>
> I have a simple test setup where output is written to an annotation
> writer: in this case Tika reads 3 HTML pages, errors on the first 2 and
> passes the annotations from the last (?). The last lines of the stack
> trace are the printed annotations.
>
> I hope this is better info.
>
> Roland
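
For illustration, the failure mode described at the top of this reply (a type lookup that returns null, with the null then passed on unchecked) can be sketched in plain Java. This is only a sketch under stated assumptions, not the Tika annotator's actual code: SketchTypeSystem and requireType are hypothetical stand-ins, and "example.NotLoadedType" is a made-up type name. The one UIMA fact relied on is that TypeSystem.getType(String) returns null when no type with that name exists.

```java
import java.util.HashMap;
import java.util.Map;

public class NullTypeCheckSketch {

    /** Hypothetical stand-in for a loaded type system (not a UIMA class). */
    static final class SketchTypeSystem {
        private final Map<String, Object> types = new HashMap<>();

        /** Simulates a type that was declared in a loaded descriptor. */
        void register(String name) {
            types.put(name, new Object());
        }

        /** Returns null for unknown names, mirroring UIMA's TypeSystem.getType(String). */
        Object getType(String name) {
            return types.get(name);
        }
    }

    /**
     * Hypothetical helper: looks up a type and fails with a descriptive
     * message right away, instead of letting a null travel on to the
     * code that creates the feature structure.
     */
    static Object requireType(SketchTypeSystem ts, String name) {
        Object type = ts.getType(name);
        if (type == null) {
            throw new IllegalStateException(
                "Type '" + name + "' not found; is the type system descriptor loaded?");
        }
        return type;
    }

    public static void main(String[] args) {
        SketchTypeSystem ts = new SketchTypeSystem();
        ts.register("org.apache.uima.tika.MarkupAnnotation");

        // Declared type: the lookup succeeds.
        requireType(ts, "org.apache.uima.tika.MarkupAnnotation");

        // Undeclared (made-up) type: the guard reports the real cause.
        try {
            requireType(ts, "example.NotLoadedType");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a guard like this, the error names the missing type at the lookup site rather than surfacing later as a bare NullPointerException inside createFS().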