ctakes-notifications mailing list archives

From "Pei Chen (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (CTAKES-96) Update Dependency Parser and Semantic Role Labeler - Thanks Jinho Choi and Lee Beecker
Date Wed, 03 Apr 2013 15:25:16 GMT

     [ https://issues.apache.org/jira/browse/CTAKES-96?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pei Chen reassigned CTAKES-96:
------------------------------

    Assignee: Pei Chen
    
> Update Dependency Parser and Semantic Role Labeler - Thanks Jinho Choi and Lee Beecker
> --------------------------------------------------------------------------------------
>
>                 Key: CTAKES-96
>                 URL: https://issues.apache.org/jira/browse/CTAKES-96
>             Project: cTAKES
>          Issue Type: New Feature
>    Affects Versions: 3.0-incubating
>            Reporter: Pei Chen
>            Assignee: Pei Chen
>             Fix For: 3.1-incubating
>
>
> Update/create new wrappers for ClearNLP that have been trained on clinical notes (SHARP/MiPACQ).
> Some notes:
> The integration will mostly consist of switching to cTAKES types.
> Here are a few critical spots:
> In the tokenizer (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/Tokenizer.java), lines 96 and 106 are all that should need changing to switch to cTAKES Sentence and Token types.
> In the POS tagger (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/PosTagger.java), most of the changes should be on lines 109 and 116-118.
> In the MP Analyzer (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/MPAnalyzer.java), the changes would be on lines 122-124, again to use the cTAKES token types.
> The Dependency Parser (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/DependencyParser.java) is a bit harder, but similar. I think you can step through, find instances of ClearTK types, and swap them for the Dependency Relation types in cTAKES. Basically, the code grabs the token, POS, and lemma data from the CAS and passes it on to Jinho's SRL. Then the work is in mapping that output back into CAS-appropriate types.
> The Semantic Role Labeler (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/SemanticRoleLabeler.java) follows a similar flow, but also pulls Dependency Parse information out of the CAS. Then the work is in extracting the SRL arguments and predicates to put back into ClearTK CAS types.
> Lastly, to get an idea of how these components are called in a UIMA pipeline, I would refer to the test cases, especially the ClearNLP test case (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/test/java/org/cleartk/clearnlp/ClearNLPTest.java).
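
The wrapper flow described above (read token, POS, and lemma data from the CAS, hand it to the parser, then map the output back into CAS types) might look roughly like the sketch below. It is only an illustration in plain Java: Token, DepRelation, RawParse, Parser, and CHAIN_PARSER are hypothetical stand-ins invented for this example, not cTAKES, ClearTK, ClearNLP, or UIMA classes, and the toy parser simply chains each token to the one before it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WrapperSketch {

    /** Hypothetical stand-in for a token annotation read from the CAS. */
    static final class Token {
        final String form, pos, lemma;
        Token(String form, String pos, String lemma) {
            this.form = form; this.pos = pos; this.lemma = lemma;
        }
    }

    /** Hypothetical stand-in for a dependency relation type written back to the CAS. */
    static final class DepRelation {
        final Token dependent;
        final Token head;   // null marks the root
        final String label;
        DepRelation(Token dependent, Token head, String label) {
            this.dependent = dependent; this.head = head; this.label = label;
        }
    }

    /** Hypothetical raw parser output: one head index (-1 = root) and one label per token. */
    static final class RawParse {
        final int[] heads;
        final String[] labels;
        RawParse(int[] heads, String[] labels) { this.heads = heads; this.labels = labels; }
    }

    /** Hypothetical parser interface; a real wrapper would call the underlying library here. */
    interface Parser {
        RawParse parse(List<String> forms, List<String> tags, List<String> lemmas);
    }

    static List<DepRelation> annotate(List<Token> sentence, Parser parser) {
        // 1. Pull token, POS, and lemma data out of the framework types.
        List<String> forms = new ArrayList<>();
        List<String> tags = new ArrayList<>();
        List<String> lemmas = new ArrayList<>();
        for (Token t : sentence) {
            forms.add(t.form);
            tags.add(t.pos);
            lemmas.add(t.lemma);
        }
        // 2. Hand the plain strings to the parser.
        RawParse raw = parser.parse(forms, tags, lemmas);
        if (raw.heads.length != sentence.size() || raw.labels.length != sentence.size()) {
            throw new IllegalStateException("parser output does not align with input tokens");
        }
        // 3. Map the raw output back onto the framework's relation type.
        List<DepRelation> relations = new ArrayList<>();
        for (int i = 0; i < sentence.size(); i++) {
            int h = raw.heads[i];
            Token head = (h < 0) ? null : sentence.get(h);
            relations.add(new DepRelation(sentence.get(i), head, raw.labels[i]));
        }
        return relations;
    }

    /** Toy parser for the demo only: first token is the root, each other token attaches to its predecessor. */
    static final Parser CHAIN_PARSER = (forms, tags, lemmas) -> {
        int[] heads = new int[forms.size()];
        String[] labels = new String[forms.size()];
        for (int i = 0; i < heads.length; i++) {
            heads[i] = i - 1;
            labels[i] = (i == 0) ? "root" : "dep";
        }
        return new RawParse(heads, labels);
    };

    public static void main(String[] args) {
        List<Token> sentence = Arrays.asList(
                new Token("Patient", "NN", "patient"),
                new Token("denies", "VBZ", "deny"),
                new Token("pain", "NN", "pain"));
        for (DepRelation r : annotate(sentence, CHAIN_PARSER)) {
            String head = (r.head == null) ? "ROOT" : r.head.form;
            System.out.println(r.dependent.form + " <-" + r.label + "- " + head);
        }
    }
}
```

Under this framing, switching type systems would only touch steps 1 and 3; the call to the parser in step 2 stays the same.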

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
