ctakes-dev mailing list archives

From Peter Klügl <peter.klu...@averbis.com>
Subject Re: Combining Knowledge- and Data-driven Methods for De-identification of Clinical Narratives
Date Fri, 18 Dec 2015 10:01:37 GMT

Sorry, there was no free time left in December for this issue, but I
will be able to provide the patches in January (for real).



On 24.11.2015 at 11:37, Azad Dehghan wrote:
> This is on my todo list for Dec. as well. If there are any more volunteers
> for translating JAPE to RUTA, please get in touch.
> Cheers,
> Azad
> On 24 Nov 2015 09:55, "Peter Klügl" <peter.kluegl@averbis.com> wrote:
>> Hi,
>> I just wanted to mention that I haven't forgotten about it. Unfortunately,
>> there is just no spare time right now. I hope I will be able to provide
>> the patches in December.
>> Best,
>> Peter
>> On 06.11.2015 at 16:40, Pei Chen wrote:
>>> Hi Peter,
>>> I think ctakes-examples is probably a good starting point, at least
>>> in terms of maven modules, etc.  I think it would be good if we use
>>> uimaFIT style as the primary approach to wiring components together and
>>> generate desc's as a secondary one...
>>> I think the actual components that would be required are probably best
>>> left up to what is actually needed for the best-performing c-deid.  The
>>> output is an interesting question: I'm not sure if we should treat this as
>>> an independent preprocessing component or as part of a pipeline (in which
>>> case we may need to propose a change to the type system, or perhaps an
>>> alternative JCas view).  You can probably open up that discussion to
>>> the dev group as you see fit.
>>> My 2 cents...
>>> On Fri, Nov 6, 2015 at 3:38 AM, Peter Klügl <peter.kluegl@averbis.com> wrote:
>>>> Hi,
>>>> Is there a cTAKES project that may serve as an example of how the
>>>> community develops or what a project should look like?
>>>> I learned that different people set up UIMA projects in quite different
>>>> ways, and I do not want to get inspired by "some sort of out-dated"
>>>> approach in the cTAKES repo.
>>>> Are there restrictions or preferences about the preprocessing components
>>>> that should be used and the kind of "output" of the project?
>>>> Components: On which components may the components rely: tokenizer, ...
>>>> parser, ... dict lookup?
>>>> "output": Should the project provide a pipeline or a single AE?
>>>> More comments below.
>>>> On 03.11.2015 at 16:54, Azad Dehghan wrote:
>>>>>> Who else plans to provide patches for it? Just to avoid duplicate work
>>>>>> and to coordinate the efforts ...
>>>>> I would like to help with translating JAPE to RUTA.
>>>> You can already go ahead with the UIMA Ruta Workbench if you want, or
>>>> wait until I set up the project with ruta integration.
>>>> If any questions arise, just ask :-)
>>>>>> Is there a development dataset which was utilized for the initial
>>>>>> development, and if yes, is it possible to contribute it too?
>>>>> The data set is unfortunately not publicly available; i2b2
>>>>> <https://www.i2b2.org/NLP/DataSets/Main.php> typically releases data
>>>>> sets 12 months after a given challenge; this is done on an individual
>>>>> basis and involves a Data Use Agreement.
>>>>> However, I will be able to conduct and coordinate the validation.
>>>> Ok, I'll investigate whether we already have access to the dataset here.
>>>>>> My first step would be:
>>>>>> - set up a maven project
>>>>>> - set up a development pipeline in a test (with cTAKES components
>>>>>> replacing the previous ANNIE preprocessing)
>>>>>> But one item that we need to review is the 3rd party lib jars that
>>>>>> were included to ensure compatibility.  I’ll be sure to take a look at
>>>>>> that over the next few weeks.
>>>>>> —Pei
>>>>> @Pei - once ANNIE components are replaced there should not be a need to
>>>>> worry about the 3rd party libs.
>>>>> Also, just a thought: we may want to create an independent component for
>>>>> the Two Pass recognition (TwoPass.java), as this method has proven useful
>>>>> for general NER on longitudinal data and is surely useful independent of
>>>>> the deid component.
>>>>> Cheers,
>>>>> Azad
