ctakes-notifications mailing list archives

From "Sean Finan (JIRA)" <j...@apache.org>
Subject [jira] [Closed] (CTAKES-485) Add Thread safe default clinical pipeline
Date Sun, 26 Nov 2017 18:57:00 GMT

     [ https://issues.apache.org/jira/browse/CTAKES-485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Sean Finan closed CTAKES-485.
    Resolution: Implemented

This implementation is thread safe, but not highly concurrent.  What does that mean?  A lot
of thread blocking.  Larger pipelines and longer notes will therefore see greater performance
gains, because threads are less likely to be attempting to use the same annotation engine at
the same time.  For instance, see below.  The default clinical pipeline sees a ~25% improvement
in performance going from 1 to 2 threads.  Going to 3 threads sees no improvement over 2.  For
a much longer "full" pipeline, adding a 3rd thread sees another 6-7% improvement.  Things like
disk I/O further contribute to the diminishing gain, but it is mostly thread contention.  What
we really need is to make each individual annotator more concurrent, reducing or removing the
amount of code that needs to be in synchronized blocks.
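
As a rough illustration of "thread safe, but not highly concurrent", here is a minimal Java
sketch.  It is not the cTAKES implementation: SharedEngineSketch, process and runAll are
hypothetical names, and the "engine" only counts notes.  It shows the general pattern of
several worker threads sharing one engine whose work sits inside a synchronized block, so
the result is correct but only one thread is inside the engine at a time.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for an annotation engine that is not thread safe by itself.
class SharedEngineSketch {
    private final Object lock = new Object();
    private int processed = 0; // shared mutable state that must be protected

    // Only one thread at a time may run the engine; all others block at the lock.
    void process(String note) {
        synchronized (lock) {
            processed++; // placeholder for the real annotation work
        }
    }

    int processedCount() {
        synchronized (lock) {
            return processed;
        }
    }

    // Runs every note through one shared engine using a fixed pool of worker threads.
    static int runAll(List<String> notes, int threads) {
        SharedEngineSketch engine = new SharedEngineSketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String note : notes) {
            pool.submit(() -> engine.process(note));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return engine.processedCount();
    }
}
```

The more of a pipeline's work that sits inside such blocks, the less an extra thread can help,
which is consistent with the flat 2-to-3 thread result for the shorter pipeline.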

If you want to test this, please do not assume that you will get your best performance
by "using all of your cores."  Use your core count minus 1.
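
A small Java sketch of that rule of thumb follows.  WorkerThreads and recommended are
hypothetical names, not part of cTAKES.  Note that Runtime.availableProcessors() reports
logical processors, so hyperthreaded cores are counted.

```java
// Hypothetical helper: pick a worker thread count of "core count minus 1".
class WorkerThreads {
    // Never returns less than 1, so a single-core machine still gets one worker.
    static int recommended(int cores) {
        return Math.max(1, cores - 1);
    }

    // Uses the number of logical processors visible to the JVM.
    static int recommended() {
        return recommended(Runtime.getRuntime().availableProcessors());
    }
}
```

On the 4-core machine described below, recommended(4) gives 3 threads.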

On my old HP EliteBook 8440p; 64-bit, (2) 2.67 GHz proc, hyperthreaded (4 core), 6 GB RAM, Windows

Processing time for notes in ctakes-examples, averaging over 3 runs each (percentages are
relative to the single-thread time):

Default Clinical
single: 0:44   100%
2proc: 0:32     73%
3proc: 0:32     73%

Full Pipeline (sections, paragraphs, lists, [default clinical], degree, location, event, time,
e-t, e-e links, coref)
single: 4:04   100%
2proc: 2:55     72%
3proc: 2:42     66%
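
The percentages can be reproduced from the listed times.  The following Java sketch uses
hypothetical names (RelativeTime, toSeconds, percentOfBaseline) and assumes times are
written as minutes:seconds.

```java
// Hypothetical helper: express a run time as a percentage of the single-thread time.
class RelativeTime {
    // Converts "m:ss" to a number of seconds.
    static int toSeconds(String mmss) {
        String[] parts = mmss.split(":");
        return Integer.parseInt(parts[0]) * 60 + Integer.parseInt(parts[1]);
    }

    // Rounds 100 * run / baseline to the nearest whole percent.
    static long percentOfBaseline(String baseline, String run) {
        return Math.round(100.0 * toSeconds(run) / toSeconds(baseline));
    }
}
```

For example, percentOfBaseline("0:44", "0:32") is 73 and percentOfBaseline("4:04", "2:42")
is 66, matching the figures above.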

> Add Thread safe default clinical pipeline
> -----------------------------------------
>                 Key: CTAKES-485
>                 URL: https://issues.apache.org/jira/browse/CTAKES-485
>             Project: cTAKES
>          Issue Type: New Feature
>    Affects Versions: 4.0.1
>            Reporter: Sean Finan
>            Assignee: Sean Finan
>            Priority: Minor
>              Labels: performance
>             Fix For: 4.0.1
> cTAKES is not thread-safe.  This has been well established.  It would be nice if at least
> the default clinical pipeline could be run with some thread safety.

This message was sent by Atlassian JIRA
