uima-user mailing list archives

From Jaroslaw Cwiklik <uim...@gmail.com>
Subject Re: UIMA-AS and CasManager.defineCasPool() was called twice by the same Analysis Engine
Date Fri, 19 Jun 2009 17:04:22 GMT
Jorn, there are a couple of problems here:

1)
"... Because the AAE is not thread safe uima as must scale it through
creating multiple instances of it..."

Since the AAE is not thread safe, you should not try to scale it out within
a single JVM: run only one instance of it per JVM, and scale out by
starting multiple JVMs.

2)
"...I must admit the documentation confused me a bit about the meaning of
the async attribute..."

The async attribute is only used for aggregates. It specifies whether the
aggregate is run asynchronously (with input queues in front of all of its
delegates) or not. Choosing async="false" means that you want to deploy the
aggregate synchronously, so it will be single-threaded. To UIMA AS, a
synchronous aggregate is the same as a UIMA primitive AE.

3)
            ...
            <analysisEngine key="TextAnalysis" async="false">
                <scaleout numberOfInstances="8" />

                <delegates>
                    <analysisEngine key="HBaseCasMultiplier">
                        <casMultiplier poolSize="8"/>
                    </analysisEngine>
                </delegates>
            </analysisEngine>
            ...

The above is an inconsistent configuration. You are specifying that
"TextAnalysis" should be deployed synchronously, but then adding delegate
configuration, which forces the aggregate to be deployed asynchronously.
The delegates of a synchronous aggregate are not "visible" to UIMA AS and
cannot be configured in the deployment descriptor.
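
As a rough sketch only, here are two ways the fragment above could be made
self-consistent. Both reuse the keys from your descriptor and assume the
rest of the descriptor stays as it is; the numberOfInstances value in the
first option is my assumption, chosen to match point 1, not something taken
from your setup.

```xml
<!-- Option A: synchronous aggregate. There is no <delegates> element,
     because the delegates are not visible to UIMA AS in this mode. -->
<analysisEngine key="TextAnalysis" async="false">
    <!-- Assumed value: one instance per JVM; scale by starting more JVMs. -->
    <scaleout numberOfInstances="1" />
</analysisEngine>
```

```xml
<!-- Option B: asynchronous aggregate. Each delegate gets an input queue
     and can be configured individually. -->
<analysisEngine key="TextAnalysis" async="true">
    <delegates>
        <analysisEngine key="HBaseCasMultiplier">
            <casMultiplier poolSize="8"/>
        </analysisEngine>
    </delegates>
</analysisEngine>
```

Which option fits depends on whether you need UIMA AS to manage the
delegates separately; if not, Option A is the simpler deployment.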

The stack trace you've submitted seems too incomplete to determine what
really happened.

Regards, Jerry C
On Fri, Jun 19, 2009 at 9:56 AM, Jörn Kottmann <kottmann@gmail.com> wrote:

> Hello everyone,
>
> I have already been using UIMA AS to tag text with a custom AAE,
> though I did not scale the AAE because I ran into a few issues back then and
> had no time to solve them.
>
> Now I tried again to scale the AAE and failed again. The AAE gets a document
> id, which is sent to it via the uimaj-as-camel component. A CAS multiplier
> then fetches the actual document out of a database, and that's also the
> component which causes trouble.
>
> Because the AAE is not thread safe uima as must scale it through creating
> multiple instances of it.
> After reading through the UIMA AS documentation I came up with this
> deployment descriptor:
>           ...
>           <analysisEngine key="TextAnalysis" async="false">
>               <scaleout numberOfInstances="8" />
>
>               <delegates>
>                   <analysisEngine key="HBaseCasMultiplier">
>                       <casMultiplier poolSize="8"/>
>                   </analysisEngine>
>               </delegates>
>           </analysisEngine>
>           ...
>
> I must admit the documentation confused me a bit about the meaning of the
> async attribute.
> Is it correct that async=false means that UIMA AS creates multiple instances,
> each of which is called from one worker thread? And that async=true would
> then mean that one AE is called by multiple threads?
>
> If numberOfInstances is larger than 1, I always get this exception:
> Caused by: org.apache.uima.UIMARuntimeException: The method CasManager.defineCasPool() was called twice by the same Analysis Engine (/HBaseCasMultiplier/).
>   at org.apache.uima.resource.impl.CasManager_impl.defineCasPool(CasManager_impl.java:181)
>   at org.apache.uima.resource.impl.CasManager_impl.defineCasPool(CasManager_impl.java:161)
>   at org.apache.uima.aae.EECasManager_impl.defineCasPool(EECasManager_impl.java:75)
>   at org.apache.uima.impl.UimaContext_ImplBase.getEmptyCas(UimaContext_ImplBase.java:565)
>   at org.apache.uima.analysis_component.CasMultiplier_ImplBase.getEmptyCAS(CasMultiplier_ImplBase.java:109)
>   at dk.infopaq.nlp.repository.connector.HBaseReadCasMultiplier.hasNext(HBaseReadCasMultiplier.java:107)
>   at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl$AnalysisComponentCasIterator.hasNext(PrimitiveAnalysisEngine_impl.java:563)
>   at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:566)
>   ... 20 more
>
>
> A while back I had a problem which resulted in the same exception message,
> but it was solved by updating UIMA to the current 2.3.0-SNAPSHOT:
> http://www.mail-archive.com/uima-user@incubator.apache.org/msg02054.html
>
> The version I am using is 2.3.0-SNAPSHOT from mid-May.
>
> Thanks,
> Jörn
>
