ctakes-notifications mailing list archives

From "James Joseph Masanz (JIRA)" <j...@apache.org>
Subject [jira] [Closed] (CTAKES-96) Update Dependency Parser and Semantic Role Labeler - Thanks Jinho Choi and Lee Beecker
Date Mon, 03 Apr 2017 20:23:41 GMT

     [ https://issues.apache.org/jira/browse/CTAKES-96?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Joseph Masanz closed CTAKES-96.
-------------------------------------

Closing this issue that was resolved a while ago.

> Update Dependency Parser and Semantic Role Labeler - Thanks Jinho Choi and Lee Beecker
> --------------------------------------------------------------------------------------
>
>                 Key: CTAKES-96
>                 URL: https://issues.apache.org/jira/browse/CTAKES-96
>             Project: cTAKES
>          Issue Type: New Feature
>    Affects Versions: 3.0-incubating
>            Reporter: Pei Chen
>            Assignee: Pei Chen
>             Fix For: 3.1.0
>
>
> Update/create new wrappers for ClearNLP that have been trained on clinical notes (SHARP/MiPACQ).
> Some notes:
> the integration will be mostly switching to cTAKES types.
> Here are a few critical spots:
> In the tokenizer (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/Tokenizer.java),
lines 96 and 106 are all that should need changing to switch to cTAKES Sentence and Token
types.
> In the pos-tagger (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/PosTagger.java)
most of the changes should be on lines 109 and 116-118.
> In the MP Analyzer (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/MPAnalyzer.java)
the changes would be lines 122-124 to again use the cTAKES token types.
> The Dependency Parser (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/DependencyParser.java)
is a bit harder, but similar.  I think you can step through and find instances of ClearTK
types and swap them for the Dependency Relation types in cTAKES.  Basically the code grabs
the token, POS, and lemma data from the CAS and passes it onto Jinho's SRL.  Then the work
is in mapping that output back into CAS appropriate types.
> The Semantic Role Labeler (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/main/java/org/cleartk/clearnlp/SemanticRoleLabeler.java)
follows a similar flow, but also pulls Dependency Parse information out of the CAS.  Then
the work is in extracting the SRL arguments and predicates to put back into ClearTK CAS types.
> Lastly, to get an idea of how these components are called in a UIMA pipeline, I would
refer to the test cases, especially the ClearNLP test case (https://code.google.com/p/cleartk/source/browse/cleartk-clearnlp/src/test/java/org/cleartk/clearnlp/ClearNLPTest.java).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
