uima-user mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: Many views in the cas to serialize cause java.lang.NullPointerException in service uima-as
Date Fri, 17 Feb 2017 19:14:35 GMT
Hi Nelson,

Due to local holidays, you may not see activity here until mid-next week.

Meanwhile, it would help to have a small test case which reproduces this issue.

If you could make one and zip it up and attach it to a Jira, that would enable
great progress :-)

-Marshall


On 2/16/2017 8:52 AM, nelson rivera wrote:
> I tested the proposed solution, adding processParentLast="true" to
> the Cas Multiplier delegate's configuration. The behavior is the
> same. The other possibility you mention is that I have a bug
> somewhere in my code which allows a CAS to be accessed in two separate
> threads, but I have no idea where, because the CAS generated in the second
> CAS multiplier is composed of all the views and then goes to the final
> annotator, which has only 1 instance, and the flow finishes there.
>
> 2017-02-15 16:31 GMT-05:00, Jaroslaw Cwiklik <uimaee@gmail.com>:
>> Nelson, change Cas Multiplier in your deployment descriptor as follows:
>>
>> <analysisEngine key="FileSystemMultiplerCas">
>>     <casMultiplier poolSize="10" processParentLast="true"/>
>> </analysisEngine>
>>
>> Note: processParentLast="true".
>>
>> In a UIMA-AS async aggregate it's possible for a child CAS and its parent CAS
>> to flow through the pipeline at the same time, and the parent CAS may reach
>> the end before its child(ren). The above setting will ensure the parent CAS
>> does not flow ahead of its children. From the UIMA-AS documentation:
>>
>> "The processParentLast attribute on the <casMultiplier> element is
>> optional, and specifies processing order of an input CAS relative to its
>> children. If true, a flow of an input CAS will be suspended after it is
>> returned from a Cas Multiplier delegate until all its child CASes have
>> finished processing. If false, an input CAS can be processed in parallel
>> with its children."
>>
>>
>> If the above change does not fix the NPE, I suspect you may have a bug
>> somewhere in your code which allows a CAS to be accessed in two separate
>> threads.
>>
>> -jerry
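
The processParentLast rule quoted above can be pictured with a small, stdlib-only Java model. This is a toy sketch under stated assumptions, not UIMA-AS code: `ParentLastGate` and all of its method names are hypothetical and exist only to show the ordering rule (the parent continues only once it has come back from the multiplier and every child has finished).

```java
/**
 * Toy model of the "process parent last" ordering rule.
 * Hypothetical names throughout; this is NOT how UIMA-AS implements it.
 */
final class ParentLastGate {
    private int outstandingChildren = 0;    // children created but not yet finished
    private boolean parentReturned = false; // parent has come back from the multiplier

    /** Record that the multiplier produced one more child. */
    synchronized void childCreated() {
        outstandingChildren++;
    }

    /** Record that the parent came back; returns true if it may continue now. */
    synchronized boolean parentReturnedFromMultiplier() {
        parentReturned = true;
        return canParentProceed();
    }

    /** Record that one child finished; returns true if the parent may continue now. */
    synchronized boolean childFinished() {
        outstandingChildren--;
        return canParentProceed();
    }

    /** The parent may continue only after it returned and no child is outstanding. */
    synchronized boolean canParentProceed() {
        return parentReturned && outstandingChildren == 0;
    }

    public static void main(String[] args) {
        ParentLastGate gate = new ParentLastGate();
        gate.childCreated();
        gate.childCreated();
        System.out.println(gate.parentReturnedFromMultiplier()); // false: two children pending
        System.out.println(gate.childFinished());                // false: one child pending
        System.out.println(gate.childFinished());                // true: parent may continue
    }
}
```

With the setting at false, the model would simply skip the gate and let the parent continue alongside its children.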
>>
>> On Wed, Feb 15, 2017 at 12:43 PM, Jaroslaw Cwiklik <uimaee@gmail.com>
>> wrote:
>>
>>> Nelson, I can try to set up a simple pipeline with one AE which will add 20
>>> views and then test serialization. Not sure if I'll get to it today. If not,
>>> this will have to wait till Monday next week. I've already mentioned this
>>> before: don't operate on a CAS once it leaves an AE. The contract is
>>> CAS-in, CAS-out. A CAS instance can only be operated on by one AE at a time.
>>>
>>> -jerry
>>>
>>> On Wed, Feb 15, 2017 at 11:06 AM, Marshall Schor <msa@schor.com> wrote:
>>>
>>>> On 2/15/2017 9:51 AM, Jaroslaw Cwiklik wrote:
>>>>> Not exactly sure how to debug this.
>>>> a small-ish test case we could run would enable debugging...
>>>>
>>>>> The UIMA-AS does not touch contents of
>>>>> a CAS directly. Are there any other errors in the log besides NPE? The
>>>>> UIMA-AS uses uima-sdk to serialize CASes. Since you are getting null from
>>>>> getView(N), this view must have been deleted somehow.
>>>>>
>>>>> -jerry
>>>>>
>>>>> On Mon, Feb 13, 2017 at 11:43 AM, nelson rivera <nelsonrivera12@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I was able to check your email only today. The aggregate is async,
>>>>>> but it processes only one input CAS at a time (the default
>>>>>> numberOfCASes).
>>>>>> I read your possible explanation, but I don't see how another
>>>>>> thread could modify the CAS, because the last annotator's execution
>>>>>> is correct and the only step left is for the UIMA-AS framework to
>>>>>> serialize the CAS.
>>>>>>
>>>>>> This is the deployment descriptor of the aggregate:
>>>>>>
>>>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>>>> <analysisEngineDeploymentDescription
>>>>>>     xmlns="http://uima.apache.org/resourceSpecifier">
>>>>>>
>>>>>>     <name>XClusterAnalyzerAE Deploy Descriptor</name>
>>>>>>     <description>Deploys XClusterAnalyzerAE</description>
>>>>>>
>>>>>>     <deployment protocol="jms" provider="activemq">
>>>>>>
>>>>>>         <service>
>>>>>>             <inputQueue endpoint="XClusterAnalyzerAggregate" brokerURL="${defaultBrokerURL}"/>
>>>>>>             <topDescriptor>
>>>>>>                 <import location="./XClusterAnalyzerAggregate.xml"/>
>>>>>>             </topDescriptor>
>>>>>>             <!-- remoteReplyQueueScaleout for remote delegate-->
>>>>>>             <analysisEngine inputQueueScaleout="2" internalReplyQueueScaleout="3">
>>>>>>                 <delegates>
>>>>>>                     <analysisEngine key="FileSystemMultiplerCas">
>>>>>>                         <casMultiplier poolSize="10"/>
>>>>>>                     </analysisEngine>
>>>>>>                     <analysisEngine key="XFileFormatDetector">
>>>>>>                         <scaleout numberOfInstances="2"/>
>>>>>>                         <asyncAggregateErrorConfiguration>
>>>>>>                             <processCasErrors maxRetries="0" continueOnRetryFailure="true"/>
>>>>>>                         </asyncAggregateErrorConfiguration>
>>>>>>                     </analysisEngine>
>>>>>>                     <analysisEngine key="XDataFileExtractor">
>>>>>>                         <scaleout numberOfInstances="2"/>
>>>>>>                         <asyncAggregateErrorConfiguration>
>>>>>>                             <processCasErrors maxRetries="0" continueOnRetryFailure="true"/>
>>>>>>                         </asyncAggregateErrorConfiguration>
>>>>>>                     </analysisEngine>
>>>>>>                     <remoteAnalysisEngine key="XLanguageDetector">
>>>>>>                         <inputQueue endpoint="XLanguageDetector" brokerURL="${defaultBrokerURL}"/>
>>>>>>                         <serializer method="xmi"/>
>>>>>>                         <asyncAggregateErrorConfiguration>
>>>>>>                             <processCasErrors maxRetries="0" continueOnRetryFailure="true"/>
>>>>>>                         </asyncAggregateErrorConfiguration>
>>>>>>                     </remoteAnalysisEngine>
>>>>>>                     <analysisEngine key="XTokenizer">
>>>>>>                         <scaleout numberOfInstances="2"/>
>>>>>>                         <asyncAggregateErrorConfiguration>
>>>>>>                             <processCasErrors maxRetries="0" continueOnRetryFailure="true"/>
>>>>>>                         </asyncAggregateErrorConfiguration>
>>>>>>                     </analysisEngine>
>>>>>>                     <analysisEngine key="XBoTModeler">
>>>>>>                         <scaleout numberOfInstances="3"/>
>>>>>>                         <asyncAggregateErrorConfiguration>
>>>>>>                             <processCasErrors maxRetries="0" continueOnRetryFailure="true"/>
>>>>>>                         </asyncAggregateErrorConfiguration>
>>>>>>                     </analysisEngine>
>>>>>>                     <analysisEngine key="MergerInViewCasMultipler">
>>>>>>                         <casMultiplier poolSize="1"/>
>>>>>>                     </analysisEngine>
>>>>>>                     <analysisEngine key="XClusterAnalyzer">
>>>>>>                         <scaleout numberOfInstances="1"/>
>>>>>>                         <asyncAggregateErrorConfiguration>
>>>>>>                             <processCasErrors maxRetries="0" continueOnRetryFailure="true"/>
>>>>>>                         </asyncAggregateErrorConfiguration>
>>>>>>                     </analysisEngine>
>>>>>>                 </delegates>
>>>>>>             </analysisEngine>
>>>>>>         </service>
>>>>>>     </deployment>
>>>>>>
>>>>>> </analysisEngineDeploymentDescription>
>>>>>>
>>>>>> 2017-02-10 16:43 GMT-05:00, Jaroslaw Cwiklik <uimaee@gmail.com>:
>>>>>>> Just a bit more evidence. The caller of getSofaAddr():
>>>>>>>
>>>>>>>     public void writeViewsCommons() throws Exception {
>>>>>>>       // Get indexes for each SofaFS in the CAS
>>>>>>>       int numViews = cas.getBaseSofaCount();
>>>>>>>
>>>>>>>       for (int sofaNum = 1; sofaNum <= numViews; sofaNum++) {
>>>>>>>         FSIndexRepositoryImpl loopIR = (FSIndexRepositoryImpl)
>>>>>>>             cas.getBaseCAS().getSofaIndexRepository(sofaNum);
>>>>>>>         final int sofaAddr = getSofaAddr(sofaNum);
>>>>>>>
>>>>>>> Not an expert on this code, but it smells like another thread is
>>>>>>> changing a CAS which is being serialized.
>>>>>>>
>>>>>>> -jerry
>>>>>>>
>>>>>>> On Fri, Feb 10, 2017 at 4:31 PM, Jaroslaw Cwiklik <uimaee@gmail.com>
>>>>>>> wrote:
>>>>>>>> Is this a primitive (single-threaded) aggregate or async
>>>>>>>> (multi-threaded)?
>>>>>>>> If async, try to simplify and run a primitive aggregate with scaleout=1.
>>>>>>>> The CAS does not seem to be null in this case. The caller of
>>>>>>>> getSerializedCas() checks for null.
>>>>>>>>
>>>>>>>> The code dies here:
>>>>>>>> Caused by: java.lang.NullPointerException
>>>>>>>>         at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.getSofaAddr(CasSerializerSupport.java:454)
>>>>>>>>
>>>>>>>>     public int getSofaAddr(int sofaNum) {
>>>>>>>>       // skip if initial view && no Sofa yet;
>>>>>>>>       // all non-initial-views must have a sofa
>>>>>>>>       if (sofaNum != 1 || cas.isInitialSofaCreated()) {
>>>>>>>>         return ((CASImpl)cas.getView(sofaNum)).getSofaRef();  // <-- NPE thrown on this line
>>>>>>>>       }
>>>>>>>>       return 0;
>>>>>>>>     }
>>>>>>>>
>>>>>>>> Looks to me like getView(sofaNum) is returning null. Is it possible that
>>>>>>>> two threads are operating on the same CAS, maybe? One removing a view
>>>>>>>> while another is trying to serialize. I have no idea what else it could be.
>>>>>>>>
>>>>>>>> -jerry
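
The suspected failure mode (a view disappearing between the moment the view count is read and the moment the view is looked up) can be reproduced deterministically in a stdlib-only Java sketch. Everything here is an assumption made for illustration: `ToyCas`, `sumSofaRefs` and the `interference` hook are hypothetical stand-ins, not UIMA classes, and the hook replaces a real second thread so the interleaving is repeatable.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy reproduction of "count read first, view removed before lookup". Not UIMA code. */
final class ViewRaceSketch {

    /** Hypothetical stand-in for a CAS: 1-based view number -> sofa reference. */
    static final class ToyCas {
        private final Map<Integer, Integer> sofaRefByView = new HashMap<>();

        void addView(int viewNum, int sofaRef) { sofaRefByView.put(viewNum, sofaRef); }
        void removeView(int viewNum) { sofaRefByView.remove(viewNum); }
        int viewCount() { return sofaRefByView.size(); }

        /** Returns null when the view does not exist. */
        Integer getView(int viewNum) { return sofaRefByView.get(viewNum); }
    }

    /**
     * Same shape as the quoted loop: read the count once, then look up each view.
     * The interference hook runs after the count is read and simulates another
     * thread touching the same CAS in the middle of serialization.
     */
    static int sumSofaRefs(ToyCas cas, Runnable interference) {
        int numViews = cas.viewCount();
        interference.run();
        int sum = 0;
        for (int viewNum = 1; viewNum <= numViews; viewNum++) {
            sum += cas.getView(viewNum); // unboxing a null Integer throws NullPointerException
        }
        return sum;
    }

    public static void main(String[] args) {
        ToyCas cas = new ToyCas();
        for (int i = 1; i <= 3; i++) {
            cas.addView(i, 100 + i);
        }
        System.out.println(sumSofaRefs(cas, () -> { })); // 306: nothing interfered
        try {
            sumSofaRefs(cas, () -> cas.removeView(3));
        } catch (NullPointerException e) {
            System.out.println("NPE: a view vanished after the count was read");
        }
    }
}
```

The sketch only shows that a stale count plus a missing view is enough to produce this kind of NullPointerException; it does not establish that this is what happens in the reported service.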
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Feb 10, 2017 at 8:45 AM, nelson rivera <nelsonrivera12@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi, the first thing I did was run these tests. I made a simple test
>>>>>>>>> case that creates a CAS with 17 views and then serializes it using
>>>>>>>>> XmiCasSerializer.serialize(newJCas.getCas(), fis), and it serializes
>>>>>>>>> correctly.
>>>>>>>>> I also made another test: I initialized the same AE locally with the
>>>>>>>>> UIMA API and processed the same input documents. The processing is
>>>>>>>>> correct, and the CAS then serializes without a problem.
>>>>>>>>>
>>>>>>>>> The error occurs when the AE is deployed in UIMA-AS and consumed as a
>>>>>>>>> service.
>>>>>>>>>
>>>>>>>>> 2017-02-09 17:30 GMT-05:00, Marshall Schor <msa@schor.com>:
>>>>>>>>>> One thing that would help track this down is a small isolated test case.
>>>>>>>>>>
>>>>>>>>>> Do you think uima-as is needed? I'm wondering if a simple test case which
>>>>>>>>>> generated 17 views and then tried to serialize would show the failure...
>>>>>>>>>>
>>>>>>>>>> If you could supply a small test case that showed the failure so we could
>>>>>>>>>> reproduce it, that would enable a rapid resolution.
>>>>>>>>>>
>>>>>>>>>> -Marshall
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2/9/2017 3:58 PM, Marshall Schor wrote:
>>>>>>>>>>> The line throwing the null pointer exception is:
>>>>>>>>>>>
>>>>>>>>>>> cas.getView(sofaNum).getSofaRef()
>>>>>>>>>>>
>>>>>>>>>>> So the NPE means either the cas is null, or getView(sofaNum) is returning
>>>>>>>>>>> null.
>>>>>>>>>>>
>>>>>>>>>>> I'm not sure what the best way is to debug this...
>>>>>>>>>>>
>>>>>>>>>>> -Marshall
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2/9/2017 12:42 PM, nelson rivera wrote:
>>>>>>>>>>>> I have a UIMA-AS aggregate service. At the end of the aggregate, the CAS
>>>>>>>>>>>> to return is composed of as many views as there are input files,
>>>>>>>>>>>> each view holding the annotations from processing.
>>>>>>>>>>>> With fewer than 15 input documents the processing always succeeds,
>>>>>>>>>>>> but if the number of documents is greater than 15, I get a
>>>>>>>>>>>> NullPointerException in the aggregate service when it tries to serialize
>>>>>>>>>>>> the CAS, not during the processing of the aggregate AE.
>>>>>>>>>>>> The logs of the aggregate service:
>>>>>>>>>>>>
>>>>>>>>>>>> 11:51:38.815 - 42:
>>>>>>>>>>>> cu.datys.xinetica.uima.core.MergerInViewCasMultipler.hasNext(285):
>>>>>>>>>>>> INFO: HasNext false
>>>>>>>>>>>> 11:51:38.875 - 44:
>>>>>>>>>>>> org.apache.uima.uimacpp.UimacppAnalysisComponent.log(396): INFO: :
>>>>>>>>>>>> XClusterAnalyzer::process --- OK
>>>>>>>>>>>> 11:51:39.145 - 45:
>>>>>>>>>>>> org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.replyToClient:
>>>>>>>>>>>> WARNING: Service: XClusterAnalyzerAggregate Runtime Exception
>>>>>>>>>>>> 11:51:39.145 - 45:
>>>>>>>>>>>> org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.replyToClient:
>>>>>>>>>>>> WARNING:
>>>>>>>>>>>> org.apache.uima.aae.error.AsynchAEException:
>>>>>>>>>>>> org.apache.uima.UIMARuntimeException
>>>>>>>>>>>>         at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.getSerializedCas(JmsOutputChannel.java:1265)
>>>>>>>>>>>>         at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.sendReply(JmsOutputChannel.java:800)
>>>>>>>>>>>>         at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.sendReplyToRemoteClient(AggregateAnalysisEngineController_impl.java:2173)
>>>>>>>>>>>>         at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.replyToClient(AggregateAnalysisEngineController_impl.java:2342)
>>>>>>>>>>>>         at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.finalStep(AggregateAnalysisEngineController_impl.java:1862)
>>>>>>>>>>>>         at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.executeFlowStep(AggregateAnalysisEngineController_impl.java:2489)
>>>>>>>>>>>>         at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.process(AggregateAnalysisEngineController_impl.java:1271)
>>>>>>>>>>>>         at org.apache.uima.aae.handler.HandlerBase.invokeProcess(HandlerBase.java:118)
>>>>>>>>>>>>         at org.apache.uima.aae.handler.input.ProcessResponseHandler.cancelTimerAndProcess(ProcessResponseHandler.java:117)
>>>>>>>>>>>>         at org.apache.uima.aae.handler.input.ProcessResponseHandler.handleProcessResponseWithCASReference(ProcessResponseHandler.java:485)
>>>>>>>>>>>>         at org.apache.uima.aae.handler.input.ProcessResponseHandler.handle(ProcessResponseHandler.java:767)
>>>>>>>>>>>>         at org.apache.uima.aae.handler.HandlerBase.delegate(HandlerBase.java:149)
>>>>>>>>>>>>         at org.apache.uima.aae.handler.input.ProcessRequestHandler_impl.handle(ProcessRequestHandler_impl.java:1113)
>>>>>>>>>>>>         at org.apache.uima.aae.spi.transport.vm.UimaVmMessageListener.onMessage(UimaVmMessageListener.java:107)
>>>>>>>>>>>>         at org.apache.uima.aae.spi.transport.vm.UimaVmMessageDispatcher$1.run(UimaVmMessageDispatcher.java:70)
>>>>>>>>>>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>>>>>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>>>>>>>>         at org.apache.uima.aae.UimaAsThreadFactory$1.run(UimaAsThreadFactory.java:132)
>>>>>>>>>>>>         at java.lang.Thread.run(Thread.java:745)
>>>>>>>>>>>> Caused by: org.apache.uima.UIMARuntimeException
>>>>>>>>>>>>         at org.apache.uima.cas.impl.XmiCasSerializer.serialize(XmiCasSerializer.java:420)
>>>>>>>>>>>>         at org.apache.uima.cas.impl.XmiCasSerializer.serialize(XmiCasSerializer.java:385)
>>>>>>>>>>>>         at org.apache.uima.aae.UimaSerializer.serializeCasToXmi(UimaSerializer.java:145)
>>>>>>>>>>>>         at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.serializeCAS(JmsOutputChannel.java:251)
>>>>>>>>>>>>         at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.getSerializedCas(JmsOutputChannel.java:1250)
>>>>>>>>>>>>         ... 18 more
>>>>>>>>>>>> Caused by: java.lang.NullPointerException
>>>>>>>>>>>>         at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.getSofaAddr(CasSerializerSupport.java:454)
>>>>>>>>>>>>         at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.writeViewsCommons(CasSerializerSupport.java:465)
>>>>>>>>>>>>         at org.apache.uima.cas.impl.XmiCasSerializer$XmiDocSerializer.writeViews(XmiCasSerializer.java:572)
>>>>>>>>>>>>         at org.apache.uima.cas.impl.CasSerializerSupport$CasDocSerializer.serialize(CasSerializerSupport.java:441)
>>>>>>>>>>>>         at org.apache.uima.cas.impl.XmiCasSerializer.serialize(XmiCasSerializer.java:415)
>>>>>>>>>>>>         ... 22 more
>>>>>>>>>>>>
>>>>

