uima-user mailing list archives

From Jos Denys <Jos.De...@intersystems.com>
Subject RE: UIMACPP and multi-threading
Date Tue, 05 Apr 2016 12:54:46 GMT
Hi Eddie,

I worked on the C++ side, and what I noticed was that the JNI interface always passes an instance
pointer:

JNIEXPORT void JNICALL JAVA_PREFIX(resetJNI) (JNIEnv* jeEnv, jobject joJTaf) {
  try {
    UIMA_TPRINT("entering resetDocument()");

    uima::JNIInstance* pInstance = JNIUtils::getCppInstance(jeEnv, joJTaf);


Now the strange thing, and ultimately what caused the access violation error, was that the pInstance
pointer was the same for the 3 threads that (simultaneously) did the UIMA processing,
so it looks like the same CAS was passed to 3 different analysis worker threads.

Any idea why and how this can happen?

Thanks for your feedback,
Jos Denys,
InterSystems Benelux.


From: Benjamin De Boe
Sent: Tuesday, April 5, 2016 09:33
To: user@uima.apache.org
Cc: Jos Denys <Jos.Denys@intersystems.com>; Chen-Chieh Hsu <Chen-Chieh.Hsu@intersystems.com>
Subject: RE: UIMACPP and multi-threading


Hi Eddie,



Thanks for your prompt response.

In our experiment, one initial thread instantiates a CasPool and passes it
on to newly spawned threads, each of which has its own DaveDetector instance and fetches a
CAS from the shared pool. The UimacppEngine objects' cppEnginePointer variable differs per
thread, but on the C++ side, it looks like all threads are pointing to the same memory address
for the CAS they operate on. Given the actions UimacppEngine.process() performs, and given that
the CAS being processed is held in a protected field rather than a local variable, it's no wonder
this causes trouble.



I can imagine UIMA-AS follows a path that's perhaps slightly different (and apparently safe,
given your test case), but I'm wondering what we're doing wrong that we need to fiddle with
synchronized keywords on the framework classes to ensure we avoid the crash.

Here's our test program. When the CAS pool is small enough (e.g. 5), things work fine. When
it is larger than the number of documents we want to process (23), it also works. When it
is somewhere in between (e.g. 20), we get the crash.



package com.intersys.uima.test;

import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;
import org.apache.uima.UIMAFramework;
import org.apache.uima.analysis_engine.AnalysisEngine;
import org.apache.uima.cas.CAS;
import org.apache.uima.resource.ResourceSpecifier;
import org.apache.uima.util.CasCreationUtils;
import org.apache.uima.util.CasPool;
import org.apache.uima.util.Level;
import org.apache.uima.util.XMLInputSource;

/**
 *
 * @author bdeboe
 */
public class Standalone implements Runnable {

    private String text;
    private AnalysisEngine ae;
    private CasPool pool;

    public Standalone(String txt, AnalysisEngine ae, CasPool pool) {
        this.text = txt;
        this.ae = ae;
        this.pool = pool;
    }

    public static void main(String[] args) throws Exception {

        String descPath = ((args != null) && (args.length > 0)) ? args[0] : "C:\\InterSystems\\UIMA\\bin\\DaveDetector.xml";
        int casPoolSize = ((args != null) && (args.length > 1)) ? Integer.valueOf(args[1]) : 20;

        XMLInputSource in = new XMLInputSource(descPath);
        ResourceSpecifier specifier
                = UIMAFramework.getXMLParser().parseResourceSpecifier(in);
        AnalysisEngine ae = UIMAFramework.produceAnalysisEngine(specifier);

        String[] text = new String[23];
        // populating the array…
        text[22] = "…";

        CasPool pool = (casPoolSize > 0) ? new CasPool(casPoolSize, ae) : null;
        for (int i = 0; i < text.length; i++) {
            Standalone task = new Standalone(text[i],
                    UIMAFramework.produceAnalysisEngine(specifier),
                    (casPoolSize > 0) ? pool : null);
            Thread t = new Thread(task);
            t.start();
        }
    }

    @Override
    public void run() {

        CAS cas  = null;
        try {
            if (pool != null) {
                cas = pool.getCas();
            } else {
                cas = CasCreationUtils.createCas(ae.getAnalysisEngineMetaData());
            }

            cas.setDocumentText(text);
            ae.process(cas);

            System.out.println("Done processing text");

        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (pool != null) pool.releaseCas(cas);
        }
    }
}
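
A side note on the pool checkout in run() above. This is a minimal sketch, assuming that the no-argument checkout of a fixed-size pool can come back empty-handed once the pool is exhausted (worth verifying against the CasPool Javadoc, which as far as I know also offers a getCas(long) variant that waits). It uses only java.util.concurrent; BlockingPoolSketch and its token objects are hypothetical stand-ins, not UIMA classes. With 23 tasks and a pool of 20, a blocking checkout makes the last 3 tasks wait for a free slot instead of proceeding without one:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a fixed-size CAS pool; this is not the UIMA CasPool class.
final class BlockingPoolSketch {

    /** Starts `tasks` threads against `poolSize` tokens; returns how many tasks obtained a token. */
    static int run(int poolSize, int tasks) throws InterruptedException {
        BlockingQueue<Object> pool = new ArrayBlockingQueue<>(poolSize);
        for (int i = 0; i < poolSize; i++) {
            pool.add(new Object());
        }

        AtomicInteger processed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            new Thread(() -> {
                Object token = null;
                try {
                    // Blocks until a token is available, so it never yields null.
                    token = pool.take();
                    // Stand-in for setDocumentText(...) followed by process(...).
                    processed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    // Only hand back what was actually checked out.
                    if (token != null) {
                        pool.add(token);
                    }
                    done.countDown();
                }
            }).start();
        }
        done.await();
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(20, 23) + " of 23 tasks processed");
    }
}
```

Whether the checkout behaviour is related to the crash is not established here; the sketch only illustrates the pattern of waiting for a pooled object and releasing it only when one was obtained.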





Probably also of note: we sometimes get a simple exception on destroyJNI() (pasted below),
rather than the outright total process crash described earlier. We assume this is just “luck”
in that the different threads are invoking a not-so-critical section.



Apr 05, 2016 9:25:25 AM org.apache.uima.uimacpp.UimacppAnalysisComponent logJTafException
SEVERE: The following internal exception was caught: 5,002 (UIMA_ERR_ENGINE_UNEXPECTED_EXCEPTION)
Apr 05, 2016 9:25:25 AM org.apache.uima.uimacpp.UimacppAnalysisComponent logJTafException(431)
SEVERE:
Error number  : 5002
Recoverable   : No
Error         : Unexpected error
(5002)
org.apache.uima.uimacpp.InternalTafException:
Error number  : 5002
Recoverable   : No
Error         : Unexpected error
(5002)
        at org.apache.uima.uimacpp.UimacppEngine.destroyJNI(Native Method)
        at org.apache.uima.uimacpp.UimacppEngine.destroy(UimacppEngine.java:304)
        at org.apache.uima.uimacpp.UimacppAnalysisComponent.destroy(UimacppAnalysisComponent.java:338)
        at org.apache.uima.uimacpp.UimacppAnalysisComponent.finalize(UimacppAnalysisComponent.java:354)
        at java.lang.System$2.invokeFinalize(System.java:1270)
        at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:98)
        at java.lang.ref.Finalizer.access$100(Finalizer.java:34)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:210)







Many thanks for your feedback,



benjamin





--

Benjamin De Boe | Product Manager

M: +32 495 19 19 27 | T: +32 2 464 97 33

InterSystems Corporation | http://www.intersystems.com



-----Original Message-----
From: Eddie Epstein [mailto:eaepstein@gmail.com]
Sent: Tuesday, April 5, 2016 12:47 AM
To: user@uima.apache.org
Subject: Re: UIMACPP and multi-threading



Hi Benjamin,



UIMACPP is thread safe, as is the JNI interface. To confirm, I just created a UIMA-AS service
with 10 instances of DaveDetector, and fed the service 800 CASes with up to 10 concurrent
CASes at any time.



It is not the case with DaveDetector, but at annotator initialization some analytics will
store info in thread-local storage, and expect the same thread to be used to call the annotator's
process method. UIMA-AS and DUCC guarantee that an instantiated AE is always called on the
same thread.
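
The thread-affinity guarantee described above can be sketched in plain Java. This is a minimal illustration, not how UIMA-AS or DUCC implement it: ConfinedEngine is a hypothetical stand-in for an analysis engine, and each instance owns a single-thread executor so that initialization and every process call run on the same thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for an engine whose annotator keeps thread-local state.
final class ConfinedEngine {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private volatile Thread initThread;

    ConfinedEngine() throws Exception {
        // Run initialization on the engine's own worker thread, not the caller's.
        worker.submit(() -> { initThread = Thread.currentThread(); }).get();
    }

    /** Queues one document; the result is true if it ran on the initializing thread. */
    Future<Boolean> process(String text) {
        // A real engine would analyze `text` here.
        return worker.submit(() -> Thread.currentThread() == initThread);
    }

    void shutdown() {
        worker.shutdown();
    }

    /** Spreads `docs` documents over `engines` engines; true if every call kept its thread. */
    static boolean runAll(int engines, int docs) throws Exception {
        List<ConfinedEngine> pool = new ArrayList<>();
        for (int i = 0; i < engines; i++) {
            pool.add(new ConfinedEngine());
        }
        List<Future<Boolean>> results = new ArrayList<>();
        for (int i = 0; i < docs; i++) {
            results.add(pool.get(i % engines).process("doc " + i));
        }
        boolean allOnInitThread = true;
        for (Future<Boolean> result : results) {
            allOnInitThread &= result.get();
        }
        for (ConfinedEngine engine : pool) {
            engine.shutdown();
        }
        return allOnInitThread;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("same thread for init and process: " + runAll(3, 23));
    }
}
```

Callers on any thread can submit documents; the executor serializes the work per engine, so no two threads ever touch the same engine instance at once.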



Eddie







On Mon, Apr 4, 2016 at 10:56 AM, Benjamin De Boe <Benjamin.DeBoe@intersystems.com> wrote:



> Hi,
>
> We're working with a UIMACPP annotator (wrapping our existing NLP
> library) and are running into what appear to be thread safety issues,
> which we can reproduce with the DaveDetector demo AE.
> When separate threads are accessing separate instances of the
> org.apache.uima.uimacpp.UimacppAnalysisComponent wrapper class on the
> Java side, it appears they are invoking the same object on the C++
> side, which results in quite a mess (access violations and process
> crashes) when different threads concurrently invoke resetJNI() and
> fillCASJNI() on the org.apache.uima.uimacpp.UimacppAnalysisComponent
> object. When using a small CAS pool on the Java side, the problem does
> not seem to occur, but it resurfaces if the CAS pool grows bigger and
> memory settings are not increased accordingly. However, if this were a
> pure memory issue, we had hoped to see more telling errors, and just
> guessing how big memory should be for larger deployments isn't a very
> appealing option either.
> Adding the synchronized keyword to the relevant method of the wrapper
> class on the Java side also avoids the issue, at the obvious cost of
> performance. Moving to UIMA-AS is not an option for us, currently.
>
> Given that the documentation is not explicit about it, we're hoping to
> get an unambiguous answer from this list: is UIMACPP actually supposed
> to be thread-safe? We saw old and resolved JIRA issues that addressed
> thread-safety issues for UIMACPP, so we assumed it was the case, but
> reality seems to point in the opposite direction.
>
> Thanks in advance for your feedback,
>
> benjamin
>
> --
> Benjamin De Boe | Product Manager
> M: +32 495 19 19 27 | T: +32 2 464 97 33 InterSystems Corporation |
> http://www.intersystems.com