ctakes-user mailing list archives

From Abhishek Raj <abhishe...@iitrpr.ac.in>
Subject Re: API for running ctakes programatically
Date Wed, 18 Jun 2014 15:11:11 GMT
There's one more thing I need advice on. I am annotating clinical
documents to find the Disorder Mentions in them. I am aware that cTAKES
does this to some extent. What would be the best way to find Disorder
Mentions in clinical documents? I am currently using the
AggregatePlainTextUMLSProcessor for this. Is there another analysis
engine that does the job better? How can I get more accurate Disorder
Mention annotations for clinical documents? Thanks! :)
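
To see which Disorder Mentions ended up in the XML files that the CPE writes, a small script can list them. The following is a minimal Python sketch, not part of cTAKES, and it rests on assumptions that this thread does not confirm: that the CAS consumer wrote XMI, that disorder annotations appear as elements whose name ends in DiseaseDisorderMention and carry begin and end offsets, and that the document text sits in the sofaString attribute of the Sofa element. The function name disorder_mentions is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: disorder annotations are serialized as elements whose tag ends
# with this type name (e.g. a namespaced DiseaseDisorderMention in XMI).
MENTION_SUFFIX = "DiseaseDisorderMention"


def disorder_mentions(xml_text):
    """Return sorted (begin, end, covered_text) tuples for one serialized CAS."""
    root = ET.fromstring(xml_text)

    # Assumption: the document text is stored in the sofaString attribute of
    # the first Sofa element. If it is missing, covered_text stays empty.
    document = ""
    for element in root.iter():
        if element.tag.endswith("Sofa") and "sofaString" in element.attrib:
            document = element.attrib["sofaString"]
            break

    mentions = []
    for element in root.iter():
        if not element.tag.endswith(MENTION_SUFFIX):
            continue
        if "begin" not in element.attrib or "end" not in element.attrib:
            continue  # skip anything that does not look like a text annotation
        begin = int(element.attrib["begin"])
        end = int(element.attrib["end"])
        mentions.append((begin, end, document[begin:end]))
    return sorted(mentions)
```

Pass it the contents of one output file at a time. Comparing the returned spans against a hand-annotated sample is one way to judge how accurate the annotations are before trying a different analysis engine.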


On Wed, Jun 18, 2014 at 8:35 PM, Abhishek Raj <abhishekrm@iitrpr.ac.in>
wrote:

> Thanks a lot for your replies. CPE did the job for me. I used it with the
> "test_plaintext.xml" CPE descriptor and "AggregatePlainTextUMLSProcessor"
> as the Analysis Engine. I gave it the path to the input directory and a
> custom output directory for writing each CAS to an XML file, and that did
> it for me! Now the annotations for each input file are stored in an XML
> file in the output directory. :)
>
>
> On Wed, Jun 18, 2014 at 8:03 PM, Pei Chen <chenpei@apache.org> wrote:
>
>> Also check out the main class in:
>>
>> https://svn.apache.org/repos/asf/ctakes/trunk/ctakes-clinical-pipeline/src/main/java/org/apache/ctakes/clinicalpipeline/ClinicalPipelineFactory.java
>> It wires up a pipeline programmatically in uimaFIT style, and you can
>> also use uimaFIT to access the annotations (type system).
>>
>> --Pei
>>
>>
>> On Wed, Jun 18, 2014 at 10:16 AM, vijay garla <vngarla@gmail.com> wrote:
>>
>>> To annotate:
>>> If you have a CPE and all the components in your pipeline are
>>> thread-safe (i.e. drop LVG from your pipeline), you can increase the
>>> number of threads in the CPE config.
>>> You can use this class, org.apache.ctakes.ytex.tools.RunCPE, to run a
>>> CPE from the command line or a script.
>>>
>>> Alternatively, run multiple CPEs in parallel (each one needs to
>>> process a different subset of the corpus).
>>>
>>> To extract annotations:
>>> Add the YTEX DBConsumer to store the annotations in a database (see
>>> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.1.2+-+YTEX+DBConsumer
>>> )
>>> Make sure you configure 'types to ignore' - you don't want to store
>>> annotations for punctuation.
>>>
>>> You can add the DBConsumer to any pipeline/CPE - you don't need any
>>> other YTEX components (however, you do have to set up a database).
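
The advice above about running several CPEs in parallel depends on each CPE getting its own subset of the corpus. One way to prepare such subsets is sketched below in Python. It is an illustration only: the function names partition and split_corpus and the part_<k> directory naming are hypothetical, and the sketch assumes each CPE is then pointed at one of the resulting directories as its input directory.

```python
import shutil
from pathlib import Path


def partition(items, n_parts):
    """Split items round-robin into n_parts disjoint lists covering every item."""
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    ordered = sorted(items)  # sort so the split is reproducible between runs
    return [ordered[i::n_parts] for i in range(n_parts)]


def split_corpus(input_dir, output_root, n_parts):
    """Copy the files in input_dir into output_root/part_<k>, one directory per CPE."""
    files = [p for p in Path(input_dir).iterdir() if p.is_file()]
    part_dirs = []
    for k, subset in enumerate(partition(files, n_parts)):
        part_dir = Path(output_root) / f"part_{k}"
        part_dir.mkdir(parents=True, exist_ok=True)
        for source in subset:
            shutil.copy2(source, part_dir / source.name)
        part_dirs.append(part_dir)
    return part_dirs
```

Because the subsets are disjoint, no document is annotated twice, and the outputs of the separate CPE runs can be merged afterwards without deduplication.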
>>>
>>>
>>>
>>>
>>> On Wed, Jun 18, 2014 at 4:56 AM, Richard Eckart de Castilho <
>>> rec@apache.org> wrote:
>>>
>>>> A Groovy script that illustrates how to use uimaFIT to compose and run
>>>> a cTAKES pipeline has been mentioned on the developers list. [1]
>>>>
>>>> I do not know whether these scripts are only in SVN or whether they are
>>>> (or are planned to be) part of a release or of some documentation.
>>>>
>>>> Cheers,
>>>>
>>>> -- Richard
>>>>
>>>> [1]
>>>> http://mail-archives.apache.org/mod_mbox/ctakes-dev/201312.mbox/%3C996FC801C05DF64A84246A106FACACD021A0DA@MSGPEXCHA08A.mfad.mfroot.org%3E
>>>>
>>>> On 18.06.2014, at 07:33, Abhishek Raj <abhishekrm@iitrpr.ac.in> wrote:
>>>>
>>>> > Hello. I have been looking for a way to run cTAKES programmatically to
>>>> > annotate a large number of documents and extract those annotations. I
>>>> > haven't come across any docs so far that explain how to do that. If
>>>> > someone could throw some light on this issue, it'd be great. Thanks! :)
>>>>
>>>>
>>>
>>
>
