ctakes-dev mailing list archives

From "Miller, Timothy" <Timothy.Mil...@childrens.harvard.edu>
Subject Re: v_snomed_fword_lookup view
Date Wed, 13 Aug 2014 18:19:13 GMT
There's nothing conceptually special about the consumer model vs.
regular annotators (Analysis Engines). You can write an output format
from any analysis engine as long as it runs after the annotators that
produce the annotations you need. If you have global constraints (for
an ARFF file, I think you need to know all the CUIs in your corpus
before you can write the attribute list), then use the process() method
[called once per document] to store CUIs in a non-UIMA class variable
(for example, a map from file id to a list/set/multiset of CUIs), and
use the collectionProcessComplete() method [called once after all
documents have been processed] to do the actual writing of the file.
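
A rough sketch of that two-phase pattern, in plain Java with no UIMA
dependency so it can be run on its own. The class name CuiArffCollector,
the placeholder CUIs, and the dense 0/1 ARFF layout are all assumptions
made for illustration; this is not cTAKES or YTEX code. In a real
annotator the two methods would be the overridden UIMA ones and the CUIs
would be read from the CAS.

```java
import java.io.PrintStream;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for an annotator/consumer; it only mirrors the
// process() / collectionProcessComplete() split described above.
class CuiArffCollector {
    // Document id -> CUIs seen in that document. A plain Java field,
    // not a UIMA feature structure, so it survives across documents.
    private final Map<String, Set<String>> cuisByDoc = new LinkedHashMap<>();

    // Called once per document: only record what was found.
    void process(String docId, List<String> cuis) {
        cuisByDoc.computeIfAbsent(docId, k -> new TreeSet<>()).addAll(cuis);
    }

    // Builds the ARFF text. This can only be done once every document
    // has been seen, because the attribute list needs all CUIs.
    String buildArff() {
        Set<String> allCuis = new TreeSet<>();
        for (Set<String> cuis : cuisByDoc.values()) {
            allCuis.addAll(cuis);
        }
        StringBuilder arff = new StringBuilder("@relation cuis\n");
        for (String cui : allCuis) {
            arff.append("@attribute ").append(cui).append(" {0,1}\n");
        }
        arff.append("@data\n");
        for (Set<String> docCuis : cuisByDoc.values()) {
            StringBuilder row = new StringBuilder();
            for (String cui : allCuis) {
                if (row.length() > 0) {
                    row.append(',');
                }
                row.append(docCuis.contains(cui) ? '1' : '0');
            }
            arff.append(row).append('\n');
        }
        return arff.toString();
    }

    // Called once after the last document: do the actual writing.
    void collectionProcessComplete(PrintStream out) {
        out.print(buildArff());
    }

    public static void main(String[] args) {
        CuiArffCollector collector = new CuiArffCollector();
        // Placeholder CUIs, made up for the example.
        collector.process("doc1", List.of("C0000002", "C0000001"));
        collector.process("doc2", List.of("C0000002"));
        collector.collectionProcessComplete(System.out);
    }
}
```

The map is keyed by document so that each row can be written against
the complete attribute list at the end; a sparse ARFF layout would work
the same way, only the row formatting would change.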

Hope that is useful. Sorry I couldn't tie it in to your previous YTEX
exporter, but I'm not familiar with that process.

Tim


On 08/13/2014 02:11 PM, Clayton Turner wrote:
> Oh okay, so is the purpose of a CasConsumer to essentially save your data
> in a representation that you can do some kind of data mining or
> classification on it?  If so, then I think I need to look into making/using
> one of those.
>
>
> On Wed, Aug 13, 2014 at 1:41 PM, Finan, Sean <
> Sean.Finan@childrens.harvard.edu> wrote:
>
>> Hi Clayton,
>>
>> I'm glad that you got it working.  Though I stated that I would, I haven't
>> yet checked the fidelity of trunk.  Urgent data request one day, "must
>> have" writing the next ... and I still live with the delusion that I left
>> academia to have free time ...
>>
>> I have never used ytex or weka, so I'm unfamiliar with all things .arff.
>> Could it be that the ytex .arff exporter needs to change the cTakes
>> annotation classes it consumes (>3.1)?
>>
>> I have a custom CasConsumer that saves text spans and Cuis to file in a
>> simple list, and that is what I used for the performance analysis of the
>> lookup module.  For our other projects here in Beantown we have other
>> various outputs that fit the job at hand: text flat files, xml files, sql
>> database tables, knot-encoded lace doilies, etc.
>>
>> I'm sure that none of the above helps you, but I felt obliged to provide
>> some kind of answer to your question.
>>
>> Sean
>>
>>> -----Original Message-----
>>> From: clayclay911@gmail.com [mailto:clayclay911@gmail.com] On Behalf Of
>>> Clayton Turner
>>> Sent: Wednesday, August 13, 2014 12:25 PM
>>> To: dev@ctakes.apache.org
>>> Subject: Re: v_snomed_fword_lookup view
>>>
>>> Okay, I believe I have ctakes dictionary fast working now. Something I'm
>>> curious about, though, is how you extract the data in order to conduct
>>> analysis.
>>>
>>> I've, in the past, been using the SparseDataExporterImpl from ytex in
>>> order to create a .arff file for use in weka, but the ctakes pipeline I'm
>>> using doesn't seem to be compatible with this ytex exporting as I'm not
>>> getting any cuis in my arff file.
>>>
>>> I'm using the aggregate plain text umls processor analysis engine from
>>> ctakes and then using the dbconsumer analysis engine from ytex (for
>>> storing into the database with regard to analysis batch).
>>>
>>> Any tips for exporting or some simple issue I'm missing?
>>>
>>> Thanks,
>>> Clayton
>>>
>>>
>>> On Mon, Aug 11, 2014 at 2:09 PM, Harpreet Khanduja <hsk5004@rit.edu>
>>> wrote:
>>>
>>>> Yes, absolutely and
>>>> no problem at all.
>>>>
>>>> Regards,
>>>> Harpreet
>>>>
>>>>
>>>> On Mon, Aug 11, 2014 at 1:16 PM, Finan, Sean <
>>>> Sean.Finan@childrens.harvard.edu> wrote:
>>>>
>>>>> Thanks Harpreet,
>>>>> That is definitely necessary to build!
>>>>>
>>>>> Those lines should already be in the pom, but commented out.  I
>>>>> think that some version/branching issues may have arisen at some
>>>>> point wrt this module ...
>>>>>
>>>>> If somebody beats me to it then cheers, otherwise I will try to
>>>>> check out tonight and get all the bits in place.
>>>>>
>>>>> Sean
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Harpreet Khanduja [mailto:hsk5004@rit.edu]
>>>>>> Sent: Monday, August 11, 2014 1:12 PM
>>>>>> To: dev@ctakes.apache.org
>>>>>> Subject: Re: v_snomed_fword_lookup view
>>>>>>
>>>>>> Hello Clayton,
>>>>>>   I do not know about ytex, but I did switch from dictionary-lookup
>>>>>> to dictionary-lookup-fast.
>>>>>>   I updated my ctakes-dictionary-lookup-fast project using maven.
>>>>>>   I think I used Team- Update and switched to the latest revision
>>>>>> available, and then I downloaded new 3.2 resources from the for umls.
>>>>>> Then I added these resources to my ctakes-dictionary-lookup-fast
>>>>>> resources folder and also to the classpath in ctakes-clinical-pipeline.
>>>>>>
>>>>>>  Then I changed the pom.xml file which belongs to the whole ctakes
>>>>>> project and added these two dependencies to the file:
>>>>>>
>>>>>> <dependency>
>>>>>>   <groupId>org.apache.ctakes</groupId>
>>>>>>   <artifactId>ctakes-dictionary-lookup-res</artifactId>
>>>>>>   <version>${ctakes.version}</version>
>>>>>> </dependency>
>>>>>> <dependency>
>>>>>>   <groupId>org.apache.ctakes</groupId>
>>>>>>   <artifactId>ctakes-dictionary-lookup-fast</artifactId>
>>>>>>   <version>${ctakes.version}</version>
>>>>>> </dependency>
>>>>>>
>>>>>> After this, I also added the dependency
>>>>>>
>>>>>> <dependency>
>>>>>>   <groupId>org.apache.ctakes</groupId>
>>>>>>   <artifactId>ctakes-dictionary-lookup-fast</artifactId>
>>>>>> </dependency>
>>>>>>
>>>>>> to the pom.xml of ctakes-clinical-pipeline.
>>>>>>
>>>>>> And then I added the resources folder in ctakes-clinical-pipeline
>>>>>> using build path configuration under the "add class" option.
>>>>>>
>>>>>> After this it should work.
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Harpreet
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Aug 11, 2014 at 12:44 PM, Clayton Turner
>>>>>> <caturner3@g.cofc.edu> wrote:
>>>>>>
>>>>>>> I still get the same error with the ctakes3.2 branch. Any suggestions?
>>>>>>>
>>>>>>> On Mon, Aug 11, 2014 at 12:06 PM, Clayton Turner
>>>>>>> <caturner3@g.cofc.edu>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I'm going to do a clean install through the repo rather than
>>>>>>>> the binaries and see if that fixes my issue because I think I
>>>>>>>> just read a past post saying the lookup2 folders exist there.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Aug 11, 2014 at 11:52 AM, Clayton Turner
>>>>>>>> <caturner3@g.cofc.edu>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> When navigating to
>>>>>>>>> ctakes-dictionary-lookup-fast\desc\analysis_engine
>>>>>>>>> there are 2 files, assumedly analysis engines.
>>>>>>>>>
>>>>>>>>> SnomedLookupAnnotator.xml and SnomedOvLookupAnnotator.xml
>>>>>>>>>
>>>>>>>>> If I pick either, I put in my UMLS information but receive an
>>>>>>>>> error when trying to run the CPE:
>>>>>>>>>
>>>>>>>>> Initialization of CAS Processor with name "SnomedOvLookupAnnotator"
>>>>>>>>> failed.
>>>>>>>>> CausedBy: org.apache.uima.resource.ResourceConfigurationException:
>>>>>>>>> Initialization of CAS processor with name "SnomedOvLookupAnnotator"
>>>>>>>>> failed.
>>>>>>>>> CausedBy: org.apache.uima.resource.ResourceInitializationException:
>>>>>>>>> Error initializing "org.apache.uima.resource.impl.DataResource_impl"
>>>>>>>>> from descriptor file:..............SnomedLookupAnnotator.xml
>>>>>>>>> CausedBy: org.apache.uima.resource.ResourceInitializationException:
>>>>>>>>> Could not access the resource data at
>>>>>>>>> file:org\apache\ctakes\dictionary\lookup2\Snomed2011ab_ctakesTui\cTakesSnomed.xml
>>>>>>>>>
>>>>>>>>> Now, I don't even have a "lookup2" folder and, subsequently, the Tui
>>>>>>>>> folder and cTakesSnomed.xml file. This seems to be the problem, but
>>>>>>>>> I'm not sure where these files are supposed to be grabbed from.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Aug 11, 2014 at 11:47 AM, Clayton Turner
>>>>>>>>> <caturner3@g.cofc.edu>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi again:
>>>>>>>>>>
>>>>>>>>>> How exactly do you switch to using the cTakes
>>>>>>>>>> dictionary-lookup-fast? Do I need to go in and alter xml files or
>>>>>>>>>> is it as simple as adding a certain item to the list of analysis
>>>>>>>>>> engines?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Aug 8, 2014 at 3:48 PM, Finan, Sean <
>>>>>>>>>> Sean.Finan@childrens.harvard.edu> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Clayton,
>>>>>>>>>>>
>>>>>>>>>>> I don't know how the ytex dictionary lookup works, so I'm afraid
>>>>>>>>>>> that I can't help you with an answer.  Maybe Vijay is the best
>>>>>>>>>>> person to do this.  If you aren't tied to ytex you could try the
>>>>>>>>>>> new cTakes dictionary-lookup-fast.  I tested "Patient came in
>>>>>>>>>>> with a malar rash" and it found "malar" and "malar rash".
>>>>>>>>>>>
>>>>>>>>>>> Vijay,
>>>>>>>>>>>
>>>>>>>>>>> At some point the lookup-fast module will be the default for the
>>>>>>>>>>> cTakes clinical pipeline.  In order to synchronize the ytex
>>>>>>>>>>> lookup with cTakes, would you like to eventually work together on
>>>>>>>>>>> reusing the same code for ytex?  I have no idea what ytex does,
>>>>>>>>>>> but I know the ins and outs of the cdl-fast module.
>>>>>>>>>>>
>>>>>>>>>>> Sean
>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: clayclay911@gmail.com [mailto:clayclay911@gmail.com] On
>>>>>>>>>>>> Behalf Of Clayton Turner
>>>>>>>>>>>> Sent: Friday, August 08, 2014 2:08 PM
>>>>>>>>>>>> To: dev@ctakes.apache.org
>>>>>>>>>>>> Subject: v_snomed_fword_lookup view
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Everyone:
>>>>>>>>>>>>
>>>>>>>>>>>> I have a question about how the v_snomed_fword_lookup view works
>>>>>>>>>>>> when running the CPE.
>>>>>>>>>>>>
>>>>>>>>>>>> So my understanding of the view is that it is a view comprised of
>>>>>>>>>>>> the ytex.umls_aui_fword table, the umls.mrconso table and
>>>>>>>>>>>> bits/pieces from other umls tables.
>>>>>>>>>>>>
>>>>>>>>>>>> I feel like this is not completely correct or my idea of how the
>>>>>>>>>>>> join to create the view works is off. For example, let's say I
>>>>>>>>>>>> want the CPE to find "malar ____" (e.g. malar rash) as a concept
>>>>>>>>>>>> in the annotations. It never happens after running my CPE
>>>>>>>>>>>> descriptor and I cannot find it in my v_snomed_fword_lookup view.
>>>>>>>>>>>>
>>>>>>>>>>>> select count(*) from umls_aui_fword where fword='malar';
>>>>>>>>>>>> yields 34 results
>>>>>>>>>>>> select count(*) from umls.mrconso where str='malar';
>>>>>>>>>>>> yields 3 results.
>>>>>>>>>>>>
>>>>>>>>>>>> So clearly these two tables know what the cui and context(s) are
>>>>>>>>>>>> for malar ____. Yet, whenever I run a gold standard set of notes
>>>>>>>>>>>> through the CPE, malar is constantly flagged as just a word token
>>>>>>>>>>>> and the concept is never grabbed. This is recurrent for lots of
>>>>>>>>>>>> other concepts, as well, I just wanted to use an example to
>>>>>>>>>>>> illustrate my issue.
>>>>>>>>>>>>
>>>>>>>>>>>> Some troubleshooting I already went through:
>>>>>>>>>>>> 1) Reinstalled ytex and umls database objects
>>>>>>>>>>>> 2) Reinstalled a second time after redownloading umls through
>>>>>>>>>>>> metamorphosys, ensuring that snomed vocabularies were included
>>>>>>>>>>>> (also checked file sizes and noticed a big difference so I know
>>>>>>>>>>>> those vocabularies ARE included)
>>>>>>>>>>>>
>>>>>>>>>>>> Anyone got any ideas as to what the issue could be?
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you,
>>>>>>>>>>>> Clayton Turner
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> --
>>>>>>>>>> Clayton Turner
>>>>>>>>>> email: caturner3@g.cofc.edu
>>>>>>>>>> phone: (843)-424-3784
>>>>>>>>>> web: claytonturner.blogspot.com
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -------------------------------------------------------------------------------------------------
>>>>>>>>>> “When scientifically investigating the natural world, the
>>>>>>>>>> only thing worse than a blind believer is a seeing denier.”
>>>>>>>>>> - Neil deGrasse Tyson
>>>>>>>>>>
>
>

