uima-user mailing list archives

From nelson rivera <nelsonriver...@gmail.com>
Subject Re: Many views in the cas to serialize cause java.lang.NullPointerException in service uima-as
Date Thu, 16 Feb 2017 13:52:55 GMT
I tested the proposed solution of adding processParentLast="true" to
the Cas Multiplier delegate's configuration. The behavior is the
same. The other possibility you mention is that I have a bug
somewhere in my code which allows a CAS to be accessed in two separate
threads, but I have no idea where that could be: the CAS generated by
the second CAS multiplier is composed of all the views, and this CAS
then goes to the final annotator, which has only 1 instance, and the
flow ends there.
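
One way to test the two-thread hypothesis from inside custom annotator or CAS multiplier code is a small ownership guard. The sketch below is a hypothetical debugging aid and is not part of UIMA or UIMA-AS; the class name SingleOwnerGuard and its enter()/exit() methods are assumptions introduced for illustration. It only detects overlapping use by code that calls it, so it cannot see access made by the framework itself (for example, the serializer).

```java
import java.util.concurrent.atomic.AtomicReference;

/**
 * Hypothetical debugging aid (not a UIMA class): remembers which thread is
 * currently working on a shared object and fails fast on overlapping use.
 * The guard is not reentrant: a second enter() from the same thread also fails.
 */
final class SingleOwnerGuard {
    // null means "nobody is inside"; otherwise the thread that called enter()
    private final AtomicReference<Thread> owner = new AtomicReference<>();

    /** Claim the guard for the calling thread, or throw if it is already claimed. */
    public void enter() {
        Thread current = Thread.currentThread();
        if (!owner.compareAndSet(null, current)) {
            Thread holder = owner.get(); // may already be null if the holder just left
            throw new IllegalStateException("already in use by "
                + (holder == null ? "another thread" : holder.getName())
                + ", entered again from " + current.getName());
        }
    }

    /** Release the guard; only the thread that claimed it may release it. */
    public void exit() {
        if (!owner.compareAndSet(Thread.currentThread(), null)) {
            throw new IllegalStateException(
                "exit() called by a thread that does not hold the guard");
        }
    }
}
```

Keeping one guard per CAS, calling enter() at the start of a component's process() method and exit() in a finally block, would surface overlapping calls as an IllegalStateException that names both threads.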

2017-02-15 16:31 GMT-05:00, Jaroslaw Cwiklik <uimaee@gmail.com>:
> Nelson, change Cas Multiplier in your deployment descriptor as follows:
>
> <analysisEngine key="FileSystemMultiplerCas">
>                         <casMultiplier poolSize="10"
> processParentLast="true"/>
> </analysisEngine>
>
> Note: processParentLast="true".
>
> In a UIMA-AS async aggregate it's possible for a child CAS and its parent CAS
> to flow through the pipeline at the same time, and the parent CAS may reach
> the end before its child(ren). The above setting will ensure the parent CAS
> does not flow ahead of its children. From the UIMA-AS documentation:
>
> "The processParentLast attribute on the <casMultiplier> element is
> optional, and specifies processing order of an input CAS relative to its
> children. If true, a flow of an input CAS will be suspended after it is
> returned from a Cas Multiplier delegate until all its child CASes have
> finished processing. If false, an input CAS can be processed in parallel
> with its children."
>
>
> If the above change does not fix the NPE, I suspect you may have a bug
> somewhere in your code which allows a CAS to be accessed in two separate
> threads.
>
> -jerry
>
> On Wed, Feb 15, 2017 at 12:43 PM, Jaroslaw Cwiklik <uimaee@gmail.com>
> wrote:
>
>> Nelson, I can try to set up a simple pipeline with one AE which will add 20
>> views and then test serialization. Not sure if I will get to it today. If not,
>> this will have to wait till Monday next week. I've already mentioned this
>> before: don't operate on a CAS once it leaves an AE. The contract is CAS-in,
>> CAS-out. A CAS instance can only be operated on by one AE at a time.
>>
>> -jerry
>>
>> On Wed, Feb 15, 2017 at 11:06 AM, Marshall Schor <msa@schor.com> wrote:
>>
>>> On 2/15/2017 9:51 AM, Jaroslaw Cwiklik wrote:
>>> > Not exactly sure how to debug this.
>>>
>>> a small-ish test case we could run would enable debugging...
>>>
>>> > The UIMA-AS does not touch contents of
>>> > a CAS directly. Are there any other errors in the log besides NPE? The
>>> > UIMA-AS uses uima-sdk to serialize CASes. Since you are getting null from
>>> > getView(N), this view must have been deleted somehow.
>>> >
>>> > -jerry
>>> >
>>> > On Mon, Feb 13, 2017 at 11:43 AM, nelson rivera <nelsonrivera12@gmail.com>
>>> > wrote:
>>> >
>>> >> I was able to check your email only today. The aggregate is async,
>>> >> but it processes only one input CAS at a time (the default
>>> >> numberOfCASes).
>>> >> I read your possible explanation, but I have no idea how another
>>> >> thread could modify the CAS, because the last annotator's execution
>>> >> is correct and the only step left is for the uima-as framework to
>>> >> serialize the CAS.
>>> >>
>>> >> This is the deployment configuration of the aggregate:
>>> >>
>>> >> <?xml version="1.0" encoding="UTF-8"?>
>>> >> <analysisEngineDeploymentDescription
>>> >>     xmlns="http://uima.apache.org/resourceSpecifier">
>>> >>
>>> >>     <name>XClusterAnalyzerAE Deploy Descriptor</name>
>>> >>     <description>Deploys XClusterAnalyzerAE</description>
>>> >>
>>> >>     <deployment protocol="jms" provider="activemq">
>>> >>
>>> >>         <service>
>>> >>             <inputQueue endpoint="XClusterAnalyzerAggregate"
>>> >> brokerURL="${defaultBrokerURL}"/>
>>> >>             <topDescriptor>
>>> >>                 <import location="./XClusterAnalyzerAggregate.xml"/>
>>> >>             </topDescriptor>
>>> >>             <!-- remoteReplyQueueScaleout for remote delegate-->
>>> >>             <analysisEngine inputQueueScaleout="2"
>>> >> internalReplyQueueScaleout="3">
>>> >>                 <delegates>
>>> >>                     <analysisEngine key="FileSystemMultiplerCas">
>>> >>                         <casMultiplier poolSize="10"/>
>>> >>                     </analysisEngine>
>>> >>                     <analysisEngine key="XFileFormatDetector">
>>> >>                         <scaleout numberOfInstances="2"/>
>>> >>                         <asyncAggregateErrorConfiguration>
>>> >>                             <processCasErrors maxRetries="0"
>>> >> continueOnRetryFailure="true"/>
>>> >>                         </asyncAggregateErrorConfiguration>
>>> >>                     </analysisEngine>
>>> >>                     <analysisEngine key="XDataFileExtractor">
>>> >>                         <scaleout numberOfInstances="2"/>
>>> >>                         <asyncAggregateErrorConfiguration>
>>> >>                             <processCasErrors maxRetries="0"
>>> >> continueOnRetryFailure="true"/>
>>> >>                         </asyncAggregateErrorConfiguration>
>>> >>                     </analysisEngine>
>>> >>                     <remoteAnalysisEngine key="XLanguageDetector">
>>> >>                         <inputQueue endpoint="XLanguageDetector"
>>> >> brokerURL="${defaultBrokerURL}"/>
>>> >>                         <serializer method="xmi"/>
>>> >>                         <asyncAggregateErrorConfiguration>
>>> >>                             <processCasErrors maxRetries="0"
>>> >> continueOnRetryFailure="true"/>
>>> >>                         </asyncAggregateErrorConfiguration>
>>> >>                     </remoteAnalysisEngine>
>>> >>                     <analysisEngine key="XTokenizer">
>>> >>                         <scaleout numberOfInstances="2"/>
>>> >>                         <asyncAggregateErrorConfiguration>
>>> >>                             <processCasErrors maxRetries="0"
>>> >> continueOnRetryFailure="true"/>
>>> >>                         </asyncAggregateErrorConfiguration>
>>> >>                     </analysisEngine>
>>> >>                     <analysisEngine key="XBoTModeler">
>>> >>                         <scaleout numberOfInstances="3"/>
>>> >>                         <asyncAggregateErrorConfiguration>
>>> >>                             <processCasErrors maxRetries="0"
>>> >> continueOnRetryFailure="true"/>
>>> >>                         </asyncAggregateErrorConfiguration>
>>> >>                     </analysisEngine>
>>> >>                     <analysisEngine key="MergerInViewCasMultipler">
>>> >>                         <casMultiplier poolSize="1"/>
>>> >>                     </analysisEngine>
>>> >>                     <analysisEngine key="XClusterAnalyzer">
>>> >>                         <scaleout numberOfInstances="1"/>
>>> >>                         <asyncAggregateErrorConfiguration>
>>> >>                             <processCasErrors maxRetries="0"
>>> >> continueOnRetryFailure="true"/>
>>> >>                         </asyncAggregateErrorConfiguration>
>>> >>                     </analysisEngine>
>>> >>                 </delegates>
>>> >>             </analysisEngine>
>>> >>         </service>
>>> >>     </deployment>
>>> >>
>>> >> </analysisEngineDeploymentDescription>
>>> >>
>>> >> 2017-02-10 16:43 GMT-05:00, Jaroslaw Cwiklik <uimaee@gmail.com>:
>>> >>> Just a bit more evidence. This is the caller of getSofaAddr():
>>> >>>
>>> >>>     public void writeViewsCommons() throws Exception {
>>> >>>       // Get indexes for each SofaFS in the CAS
>>> >>>       int numViews = cas.getBaseSofaCount();
>>> >>>
>>> >>>       for (int sofaNum = 1; sofaNum <= numViews; sofaNum++) {
>>> >>>         FSIndexRepositoryImpl loopIR = (FSIndexRepositoryImpl)
>>> >>>  cas.getBaseCAS().getSofaIndexRepository(sofaNum);
>>> >>>         final int sofaAddr = getSofaAddr(sofaNum);
>>> >>>
>>> >>> I'm not an expert on this code, but it smells like another thread is
>>> >>> changing a CAS while it is being serialized.
>>> >>>
>>> >>> -jerry
>>> >>>
>>> >>> On Fri, Feb 10, 2017 at 4:31 PM, Jaroslaw Cwiklik <uimaee@gmail.com> wrote:
>>> >>>> Is this a primitive (single-threaded) aggregate or async
>>> >>>> (multi-threaded)?
>>> >>>> If async, try to simplify and run primitive aggregate with scaleout=1.
>>> >>>>
>>> >>>> The CAS does not seem to be null in this case. The caller of
>>> >>>> getSerializedCas() checks for null.
>>> >>>>
>>> >>>> The code dies here:
>>> >>>> Caused by: java.lang.NullPointerException
>>> >>>>         at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.getSofaAddr(CasSerializerSupport.java:454)
>>> >>>>
>>> >>>>    public int getSofaAddr(int sofaNum) {
>>> >>>>       // skip if initial view && no Sofa yet;
>>> >>>>       // all non-initial-views must have a sofa
>>> >>>>       if (sofaNum != 1 || cas.isInitialSofaCreated()) {
>>> >>>>         return ((CASImpl)cas.getView(sofaNum)).getSofaRef();  // the NPE is thrown on this line
>>> >>>>       }
>>> >>>>       return 0;
>>> >>>>     }
>>> >>>>
>>> >>>> It looks to me like getView(sofaNum) is returning null. Is it possible
>>> >>>> that two threads are operating on the same CAS, one removing a view
>>> >>>> while another is trying to serialize it? I have no idea what else it
>>> >>>> could be.
>>> >>>>
>>> >>>> -jerry
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> On Fri, Feb 10, 2017 at 8:45 AM, nelson rivera <nelsonrivera12@gmail.com>
>>> >>>> wrote:
>>> >>>>
>>> >>>>> Hi, the first thing I did was run these tests. I made a simple test
>>> >>>>> case that creates a CAS with 17 views and then serializes it using
>>> >>>>> XmiCasSerializer.serialize(newJCas.getCas(), fis), and it serializes
>>> >>>>> correctly.
>>> >>>>> I also made another test: I initialized the same AE locally with the
>>> >>>>> UIMA API and processed the same input documents. The processing is
>>> >>>>> correct, and the CAS is then serialized without a problem.
>>> >>>>>
>>> >>>>> The error occurs when the AE is deployed in uima-as and called as a
>>> >>>>> service.
>>> >>>>>
>>> >>>>> 2017-02-09 17:30 GMT-05:00, Marshall Schor <msa@schor.com>:
>>> >>>>>> one thing that would help track this down is a small isolated test
>>> >>>>>> case.
>>> >>>>>>
>>> >>>>>> Do you think uima-as is needed? I'm wondering if a simple test case which
>>> >>>>>> generated 17 views and then tried to serialize would show the
>>> >>>>>> failure...
>>> >>>>>>
>>> >>>>>> If you could supply a small test case that showed the failure so we could
>>> >>>>>> reproduce it, that would enable a rapid resolution.
>>> >>>>>>
>>> >>>>>> -Marshall
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> On 2/9/2017 3:58 PM, Marshall Schor wrote:
>>> >>>>>>> The line throwing the null pointer exception is:
>>> >>>>>>>
>>> >>>>>>> cas.getView(sofaNum).getSofaRef()
>>> >>>>>>>
>>> >>>>>>> So the NPE means either the cas is null, or getView(sofaNum) is
>>> >>>>>>> returning null.
>>> >>>>>>>
>>> >>>>>>> I'm not sure what the best way is to debug this...
>>> >>>>>>>
>>> >>>>>>> -Marshall
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> On 2/9/2017 12:42 PM, nelson rivera wrote:
>>> >>>>>>>> I have an aggregate uima-as service. At the end of the aggregate, the
>>> >>>>>>>> CAS to return is composed of as many views as the number of input
>>> >>>>>>>> files, each view with the annotations from processing.
>>> >>>>>>>> With fewer than 15 input documents the processing always succeeds,
>>> >>>>>>>> but if the number of documents is greater than 15, I get a
>>> >>>>>>>> NullPointerException in the aggregate service when it tries to
>>> >>>>>>>> serialize the CAS, not during the processing in the aggregate AE.
>>> >>>>>>>> The logs of the aggregate service:
>>> >>>>>>>>
>>> >>>>>>>> 11:51:38.815 - 42:
>>> >>>>>>>> cu.datys.xinetica.uima.core.MergerInViewCasMultipler.hasNext(285):
>>> >>>>>>>> INFO: HasNext false
>>> >>>>>>>> 11:51:38.875 - 44:
>>> >>>>>>>> org.apache.uima.uimacpp.UimacppAnalysisComponent.log(396): INFO: :
>>> >>>>>>>> XClusterAnalyzer::process --- OK
>>> >>>>>>>> 11:51:39.145 - 45:
>>> >>>>>>>> org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.replyToClient:
>>> >>>>>>>> WARNING: Service: XClusterAnalyzerAggregate Runtime Exception
>>> >>>>>>>> 11:51:39.145 - 45:
>>> >>>>>>>> org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.replyToClient:
>>> >>>>>>>> WARNING:
>>> >>>>>>>> org.apache.uima.aae.error.AsynchAEException:
>>> >>>>>>>> org.apache.uima.UIMARuntimeException
>>> >>>>>>>>         at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.getSerializedCas(JmsOutputChannel.java:1265)
>>> >>>>>>>>         at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.sendReply(JmsOutputChannel.java:800)
>>> >>>>>>>>         at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.sendReplyToRemoteClient(AggregateAnalysisEngineController_impl.java:2173)
>>> >>>>>>>>         at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.replyToClient(AggregateAnalysisEngineController_impl.java:2342)
>>> >>>>>>>>         at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.finalStep(AggregateAnalysisEngineController_impl.java:1862)
>>> >>>>>>>>         at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.executeFlowStep(AggregateAnalysisEngineController_impl.java:2489)
>>> >>>>>>>>         at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.process(AggregateAnalysisEngineController_impl.java:1271)
>>> >>>>>>>>         at org.apache.uima.aae.handler.HandlerBase.invokeProcess(HandlerBase.java:118)
>>> >>>>>>>>         at org.apache.uima.aae.handler.input.ProcessResponseHandler.cancelTimerAndProcess(ProcessResponseHandler.java:117)
>>> >>>>>>>>         at org.apache.uima.aae.handler.input.ProcessResponseHandler.handleProcessResponseWithCASReference(ProcessResponseHandler.java:485)
>>> >>>>>>>>         at org.apache.uima.aae.handler.input.ProcessResponseHandler.handle(ProcessResponseHandler.java:767)
>>> >>>>>>>>         at org.apache.uima.aae.handler.HandlerBase.delegate(HandlerBase.java:149)
>>> >>>>>>>>         at org.apache.uima.aae.handler.input.ProcessRequestHandler_impl.handle(ProcessRequestHandler_impl.java:1113)
>>> >>>>>>>>         at org.apache.uima.aae.spi.transport.vm.UimaVmMessageListener.onMessage(UimaVmMessageListener.java:107)
>>> >>>>>>>>         at org.apache.uima.aae.spi.transport.vm.UimaVmMessageDispatcher$1.run(UimaVmMessageDispatcher.java:70)
>>> >>>>>>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>> >>>>>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>> >>>>>>>>         at org.apache.uima.aae.UimaAsThreadFactory$1.run(UimaAsThreadFactory.java:132)
>>> >>>>>>>>         at java.lang.Thread.run(Thread.java:745)
>>> >>>>>>>> Caused by: org.apache.uima.UIMARuntimeException
>>> >>>>>>>>         at org.apache.uima.cas.impl.XmiCasSerializer.serialize(XmiCasSerializer.java:420)
>>> >>>>>>>>         at org.apache.uima.cas.impl.XmiCasSerializer.serialize(XmiCasSerializer.java:385)
>>> >>>>>>>>         at org.apache.uima.aae.UimaSerializer.serializeCasToXmi(UimaSerializer.java:145)
>>> >>>>>>>>         at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.serializeCAS(JmsOutputChannel.java:251)
>>> >>>>>>>>         at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.getSerializedCas(JmsOutputChannel.java:1250)
>>> >>>>>>>>         ... 18 more
>>> >>>>>>>> Caused by: java.lang.NullPointerException
>>> >>>>>>>>         at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.getSofaAddr(CasSerializerSupport.java:454)
>>> >>>>>>>>         at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.writeViewsCommons(CasSerializerSupport.java:465)
>>> >>>>>>>>         at org.apache.uima.cas.impl.XmiCasSerializer$XmiDocSerializer.writeViews(XmiCasSerializer.java:572)
>>> >>>>>>>>         at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.serialize(CasSerializerSupport.java:441)
>>> >>>>>>>>         at org.apache.uima.cas.impl.XmiCasSerializer.serialize(XmiCasSerializer.java:415)
>>> >>>>>>>>         ... 22 more
>>> >>>>>>>>
>>> >>>>>>
>>> >>>>
>>>
>>>
>>
>
