ctakes-dev mailing list archives

From "Chen, Pei" <Pei.C...@childrens.harvard.edu>
Subject RE: umls lookup issue
Date Fri, 16 Aug 2013 02:23:50 GMT
Hi Samir,
Do you have a sample sentence that causes the 3hr run?
Also, could you attach the AggregatePipeline.xml configuration you used, in case someone else
on the dev list has already run into this?

I'll try and see if I can recreate it.
--Pei
________________________________
From: samir chabou [samirchb@yahoo.com]
Sent: Thursday, August 15, 2013 7:07 PM
To: Chen, Pei
Subject: Re: umls lookup issue

Hi Pei,
We did more debugging, and it is the lookup call below (highlighted in yellow) that causes the
delay.

performLookup is in DictionaryLookupAnnotator.java

private void performLookup(JCas jcas, LookupSpec ls, List lookupTokenList,
        Map ctxMap) throws Exception
{
    // sort the lookup tokens
    Collections.sort(lookupTokenList, LookupTokenComparator.getInstance());

    // perform lookup
    Collection lookupHitCol = null;

    LookupAlgorithm la = (LookupAlgorithm) ls.getLookupAlgorithm();
    lookupHitCol = la.lookup(lookupTokenList, ctxMap);  // the lookup call reported as causing the delay

Samir
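
One way to confirm that a single call dominates the run time is to wrap it in a timer and log the elapsed time per lookup window. The following is only a minimal sketch: `LookupTimer` and `timeMillis` are hypothetical names, not part of cTAKES, and the loop in `main` merely stands in for the suspect `la.lookup(lookupTokenList, ctxMap)` call.

```java
public class LookupTimer {

    // Hypothetical helper (not part of cTAKES): runs the task once and
    // returns the elapsed wall-clock time in milliseconds.
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1000000L;
    }

    public static void main(String[] args) {
        long ms = timeMillis(new Runnable() {
            public void run() {
                // Stand-in for the suspect call, e.g. la.lookup(lookupTokenList, ctxMap).
                long sum = 0;
                for (int i = 0; i < 1000000; i++) {
                    sum += i;
                }
            }
        });
        System.out.println("lookup took " + ms + " ms");
    }
}
```

Since the real lookup call throws a checked exception, it would need a try/catch inside `run()` when wrapped this way.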



________________________________
From: "Chen, Pei" <Pei.Chen@childrens.harvard.edu>
To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
Cc: samir chabou <samirchb@yahoo.com>
Sent: Thursday, August 15, 2013 9:00:37 AM
Subject: RE: umls lookup issue

Hi Samir,
[including the public dev list]
Thanks for opening up a new thread on this issue.
Would you be able to help narrow down the sentence that you believe is causing the NP2LookupWindow
to take 3h to process?  I can’t seem to reproduce it on my end.
I vaguely remember someone running into a case where it could go into a loop, so hopefully
they can also chime in…

--Pei

From: samir chabou [mailto:samirchb@yahoo.com]
Sent: Wednesday, August 14, 2013 7:30 PM
To: Chen, Pei
Subject: Re: umls lookup issue


Specifically, it is the NP2LookupWindow that causes the delay.
________________________________
From: samir chabou <samirchb@yahoo.com>
To: "Chen, Pei" <Pei.Chen@childrens.harvard.edu>
Sent: Wednesday, August 14, 2013 7:21:18 PM
Subject: Re: umls lookup issue


Hi Pei
I removed the LookupWindowAnnotator and the run went very fast (less than 1 min), but there were
no annotations for EntityMention and EventMention. It looks like there is something wrong with
the LookupWindowAnnotator.
Samir



________________________________
From: samir chabou <samirchb@yahoo.com>
To: "Chen, Pei" <Pei.Chen@childrens.harvard.edu>
Sent: Wednesday, August 14, 2013 7:11:57 PM
Subject: Re: umls lookup issue

Hi Pei
I removed the LookupWindowAnnotator and the run went very fast (less than 1 min), but there were
no annotations for EntityMention and EventMention. It looks like there is something wrong with
the LookupWindowAnnotator.
Samir


________________________________
From: "Chen, Pei" <Pei.Chen@childrens.harvard.edu>
To: samir chabou <samirchb@yahoo.com>
Sent: Wednesday, August 14, 2013 3:40:46 PM
Subject: RE: umls lookup issue

That is strange; it shouldn’t take that long.  I wonder if it’s going into an infinite
loop.
Have you tried debugging it?  Perhaps by removing some of the lines in the note, or by removing
the dictionary lookup component itself?
--Pei
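
The idea of removing lines from the note can be made systematic with a bisection. This is only a sketch under stated assumptions: exactly one line is responsible for the slowdown, and `SlowCheck` is a hypothetical callback (not a cTAKES API) that runs the pipeline on the given lines and reports whether the run exceeded some time limit. Here it is faked with a string match.

```java
import java.util.Arrays;
import java.util.List;

public class SlowLineFinder {

    /** Hypothetical check: true if processing these lines is slow. */
    interface SlowCheck {
        boolean isSlow(List<String> lines);
    }

    /** Returns the index of the offending line, assuming exactly one exists. */
    static int findSlowLine(List<String> lines, SlowCheck check) {
        int lo = 0;
        int hi = lines.size();            // invariant: offending index is in [lo, hi)
        while (hi - lo > 1) {
            int mid = lo + (hi - lo) / 2;
            if (check.isSlow(lines.subList(lo, mid))) {
                hi = mid;                 // the slow line is in the first half
            } else {
                lo = mid;                 // otherwise it must be in the second half
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        List<String> note = Arrays.asList("line a", "line b", "BAD line", "line d", "line e");
        // Fake check for the demo: a line starting with "BAD" counts as slow.
        SlowCheck fake = new SlowCheck() {
            public boolean isSlow(List<String> lines) {
                for (String s : lines) {
                    if (s.startsWith("BAD")) {
                        return true;
                    }
                }
                return false;
            }
        };
        System.out.println("offending line index: " + findSlowLine(note, fake));
    }
}
```

With a real pipeline run behind the check, each step halves the amount of text left to inspect, so even a long note needs only a handful of runs.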

From: samir chabou [mailto:samirchb@yahoo.com]
Sent: Wednesday, August 14, 2013 1:14 PM
To: Chen, Pei
Subject: Re: umls lookup issue

Hi Pei,
Unfortunately, removing the DependencyParser and Assertion did not make a difference
(it had been running for 1h, so I stopped it). Pei, I think the bottleneck is the LookupWindowAnnotator:
yesterday, while it was running, the console showed the LookupWindowAnnotator annotations, and it
took quite some time to go from one LookupWindow to another. Also, these LookupWindow annotations
were done twice.

Memory: Xms500M and Xmx1500

JDK: JavaSE-1.6 (jre7)

Below is a screen capture showing where I got the memory and JDK info, plus the structure of
AggregatePlaintextUMLSProcessor.xml without the DependencyParser and Assertion.

Thanks a lot
Samir

________________________________
From: "Chen, Pei" <Pei.Chen@childrens.harvard.edu>
To: samir chabou <samirchb@yahoo.com>
Sent: Wednesday, August 14, 2013 10:08:00 AM
Subject: RE: umls lookup issue

Hi Samir,
It shouldn’t take 3h… it’s a bit strange.  cTAKES is constrained much more by memory
than by CPU.  Do you know which JDK and which Java memory settings were used?
Could you also try removing the new annotators that were added in 3.0 (DependencyParser and
Assertion module)?  See attached as an example.
--Pei
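
For reference, the JDK and heap limit that a JVM is actually running with can be printed with standard calls. A minimal sketch (the class name `JvmInfo` is made up for illustration); it only reports values for the JVM it runs in, so it should be launched the same way as the pipeline:

```java
public class JvmInfo {

    // Maximum heap the JVM will attempt to use, in megabytes (reflects -Xmx).
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("java.vendor  = " + System.getProperty("java.vendor"));
        System.out.println("java.home    = " + System.getProperty("java.home"));
        System.out.println("max heap MB  = " + maxHeapMb());
    }
}
```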

From: samir chabou [mailto:samirchb@yahoo.com]
Sent: Tuesday, August 13, 2013 10:48 PM
To: Chen, Pei
Subject: Re: umls lookup issue

Hi Pei
I tried the clinical pipeline as is, with no modification except for the UMLS username and password;
it took more than 5h on my laptop to process the text sample that I sent you. Then I thought
maybe my laptop was not powerful enough, so I tried it on another laptop (i7, 16M, 2.4 GHz),
but again it took more than 3h.  If you ran it within 5 minutes, I was wondering what your
environment was.
As a next step, as you suggested, I will try to create a local MySQL database for umls2011ab
and process the text. But again, it is strange that in cTAKES 2.5 this same test took
less than one minute.
Thanks a lot for your cooperation; it is appreciated.



