Date: Sun, 26 Nov 2017 18:57:00 +0000 (UTC)
From: "Sean Finan (JIRA)"
To: notifications@ctakes.apache.org
Reply-To: dev@ctakes.apache.org
Subject: [jira] [Closed] (CTAKES-485) Add Thread safe default clinical pipeline

     [ https://issues.apache.org/jira/browse/CTAKES-485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Finan closed CTAKES-485.
-----------------------------
    Resolution: Implemented

This implementation is thread safe, but not highly concurrent. What does that mean? A lot of thread blocking. So, larger pipelines and longer notes will see greater performance gains, because threads are less likely to be attempting to use the same annotation engine at the same time.

For instance, see the timings below. The default clinical pipeline sees ~25% improvement in performance going from 1 to 2 threads. Going to 3 threads sees no improvement over 2. For a much longer "full" pipeline, adding a 3rd thread sees another 6-7% improvement. Things like disk I/O further contribute to the diminishing gains, but it is mostly thread contention.

What we really need is to make each individual annotator more concurrent, reducing or removing the amount of code that needs to be in synchronized blocks.
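
To make "thread safe, but not highly concurrent" concrete, here is a minimal Java sketch of the general pattern: an engine that is not thread safe is shared between worker threads and guarded by a single lock. This is an illustration only, not the code committed for CTAKES-485. The class and method names (UnsafeAnnotator, SynchronizedAnnotator, runNotes) are hypothetical and are not part of the cTAKES or UIMA APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SynchronizedAnnotatorSketch {

    /** Hypothetical stand-in for an annotation engine that is not thread safe. */
    static class UnsafeAnnotator {
        private int processed = 0; // shared mutable state: the reason a lock is needed

        void process(String note) {
            processed++; // not atomic, so unsafe without external locking
        }

        int getProcessed() {
            return processed;
        }
    }

    /** Thread-safe wrapper: only one thread may use the engine at a time. */
    static class SynchronizedAnnotator {
        private final UnsafeAnnotator delegate = new UnsafeAnnotator();
        private final Object lock = new Object();

        void process(String note) {
            synchronized (lock) { // other threads block here: safe, but serialized
                delegate.process(note);
            }
        }

        int getProcessed() {
            synchronized (lock) {
                return delegate.getProcessed();
            }
        }
    }

    /** Pushes noteCount notes through one shared annotator using threadCount threads. */
    static int runNotes(int threadCount, int noteCount) throws Exception {
        SynchronizedAnnotator annotator = new SynchronizedAnnotator();
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < noteCount; i++) {
                final String note = "note " + i;
                pending.add(pool.submit(() -> annotator.process(note)));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion and surface any failure
            }
        } finally {
            pool.shutdown();
        }
        return annotator.getProcessed();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runNotes(2, 100));
    }
}
```

With a single engine, every call queues on the same lock, so extra threads add nothing. In a pipeline with many engines, each guarded separately, threads working on different notes tend to be in different engines at any given moment, which is where the gain comes from. Shrinking the synchronized region inside each engine would reduce the blocking further.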

Just in case you want to test this, please do not think that you will get your best performance by "using all of your cores." Use your core count - 1.

On my old HP EliteBook 8440p; 64bit, (2) 2.67 GHz proc, hyperthreaded (4 core), 6GB RAM, Windows 7 (64b)

Processing time for notes in ctakes-examples, averaging over 3 runs each:

Default Clinical
    single:  0:44   100%
    2proc:   0:32    73%
    3proc:   0:32    73%

Full Pipeline (sections, paragraphs, lists, [default clinical], degree, location, event, time, e-t, e-e links, coref)
    single:  4:04   100%
    2proc:   2:55    72%
    3proc:   2:42    66%

> Add Thread safe default clinical pipeline
> -----------------------------------------
>
>                 Key: CTAKES-485
>                 URL: https://issues.apache.org/jira/browse/CTAKES-485
>             Project: cTAKES
>          Issue Type: New Feature
>    Affects Versions: 4.0.1
>            Reporter: Sean Finan
>            Assignee: Sean Finan
>            Priority: Minor
>              Labels: performance
>             Fix For: 4.0.1
>
>
> cTakes is not thread-safe.  This has been well established.  It would be nice if at least the default clinical pipeline could be run with some thread safety.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)