ctakes-dev mailing list archives

From "Chen, Pei" <Pei.C...@childrens.harvard.edu>
Subject Re: suggestion for default pipelines
Date Sun, 27 Apr 2014 15:52:05 GMT
My vote would be for the latter. Have the "Factory" create pipelines instead. It could just
be a naming thing though...

+1 for building dynamic pipelines. I think this idea has been thrown around for some time,
but it hasn't really been worked on, so it would be cool to see it in action. I think the tricky
part is handling pipeline dependencies, i.e. a concept similar to Maven/Ivy.
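
One way to picture the dependency handling mentioned above is a plain topological sort: each component declares what it needs, and the pipeline order is derived from that. This is only a rough sketch in plain Java. The class `PipelineOrder`, the method `resolve`, and the dependency graph in `main` are all hypothetical; nothing here is cTAKES or uimaFIT API, and the component names are used purely as labels.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from cTAKES or uimaFIT.
final class PipelineOrder {

    // Returns the components in an order where each one comes after
    // everything it depends on (Kahn's topological sort).
    static List<String> resolve(Map<String, List<String>> dependsOn) {
        Map<String, Integer> remaining = new LinkedHashMap<>();
        Map<String, List<String>> dependents = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : dependsOn.entrySet()) {
            remaining.put(e.getKey(), e.getValue().size());
            for (String dep : e.getValue()) {
                if (!dependsOn.containsKey(dep)) {
                    throw new IllegalArgumentException("Unknown dependency: " + dep);
                }
                dependents.computeIfAbsent(dep, k -> new ArrayList<>()).add(e.getKey());
            }
        }

        // Start with the components that need nothing else.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : remaining.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String next = ready.remove();
            order.add(next);
            // Each dependent has one fewer unmet dependency now.
            for (String d : dependents.getOrDefault(next, List.of())) {
                if (remaining.merge(d, -1, Integer::sum) == 0) {
                    ready.add(d);
                }
            }
        }
        if (order.size() != dependsOn.size()) {
            throw new IllegalStateException("Dependency cycle detected");
        }
        return order;
    }

    public static void main(String[] args) {
        // Assumed, illustrative dependency graph.
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("DictionaryLookup", List.of("Chunker"));
        deps.put("Chunker", List.of("POSTagger"));
        deps.put("POSTagger", List.of("Tokenizer"));
        deps.put("Tokenizer", List.of("SentenceDetector"));
        deps.put("SentenceDetector", List.of());
        System.out.println(resolve(deps));
    }
}
```

A real version would presumably return descriptors rather than names, but the ordering problem would look the same.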

Sent from my iPhone

> On Apr 24, 2014, at 5:48 PM, "Miller, Timothy" <Timothy.Miller@childrens.harvard.edu> wrote:
> 
> Any preference for separate factory classes:
> 
> class SentenceDetectorAnnotatorFactory:
> 
> static AnalysisEngineDescription getSentenceDetectorAnnotator()
> 
> VS
> 
> static methods added to primitive annotators:
> 
> class SentenceDetector (existing)
> 
> static AnalysisEngineDescription getSentenceDetectorAnnotator()
> 
> ?
> 
> The former can clutter up the class space while the latter extends the
> length of classes, especially if there are multiple versions
> (getUMLSDictionaryAnnotator(), getICD9DictionaryAnnotator(),
> getMeshDictionaryAnnotator(), etc.)
> 
> Tim
> 
>> On 04/16/2014 04:48 AM, Richard Eckart de Castilho wrote:
>> It would be nice if uimaFIT provided a Maven plugin to automatically
>> generate descriptors for aggregates. Maybe if we come up with a 
>> convention for factories, e.g. a "class with static methods that do
>> not take any parameters and that return descriptors", or "methods
>> that bear a specific Java annotation, e.g. @AutoGenerateDescriptor",
>> it should be possible to implement such a Maven plugin.
>> 
>> Cheers,
>> 
>> -- Richard
>> 
>>> On 16.04.2014, at 05:21, Steven Bethard <steven.bethard@gmail.com> wrote:
>>> 
>>> +1. And note that once you have a descriptor, you can generate the
>>> XML, so we should arrange to replace the current XML descriptors with
>>> ones generated automatically from the uimaFIT code. That should reduce
>>> some synchronization problems when the Java code was changed but the
>>> XML descriptor was not.
>>> 
>>> Steve
>>> 
>>> On Tue, Apr 15, 2014 at 8:52 AM, Miller, Timothy
>>> <Timothy.Miller@childrens.harvard.edu> wrote:
>>>> The discussion in the other thread with Abraham Tom gave me an idea I
>>>> wanted to float to the list. We have been using some uimaFIT pipeline
>>>> builders in the temporal project that maybe could be moved into
>>>> clinical-pipeline. For example, look at this file:
>>>> 
>>>> http://svn.apache.org/viewvc/ctakes/trunk/ctakes-temporal/src/main/java/org/apache/ctakes/temporal/pipelines/TemporalExtractionPipeline_ImplBase.java?view=markup
>>>> 
>>>> with the static methods getPreprocessorAggregateBuilder() and
>>>> getLightweightPreprocessorAggregateBuilder()   [no umls].
>>>> 
>>>> So my idea would be to create a class in clinical-pipeline
>>>> (CTakesPipelines) with static methods for some standard pipelines (to
>>>> return AnalysisEngineDescriptions instead of AggregateBuilders?):
>>>> 
>>>> getStandardUMLSPipeline()  -- builds pipeline currently in
>>>> AggregatePlaintextUMLSProcessor.xml
>>>> getFullPipeline() -- same as above but with SRL, constituency parsing,
>>>> etc., every component in ctakes
>>>> 
>>>> We could then potentially merge our entry points -- I think Abraham's
>>>> experience points out that this is currently confusing, as well as
>>>> probably not implemented optimally. For example, either
>>>> ClinicalPipelineWithUmls or BagOfCUIsGenerator would use that static
>>>> method to run a uimafit-style pipeline. Maybe we can slowly deprecate
>>>> our xml descriptors too unless people feel strongly about keeping those
>>>> around.
>>>> 
>>>> Another benefit is that the cTAKES API is then trivial -- if you import
>>>> ctakes into your pom file, getting a UIMA pipeline is one uimaFIT call:
>>>> 
>>>> builder.add(CTakesPipelines.getStandardUMLSPipeline());
>>>> 
>>>> 
>>>> I think this would actually be pretty easy to implement, but hoping to
>>>> get some feedback on whether this is a good direction.
>>>> 
>>>> Tim
> 
> -- 
> Tim Miller
> Instructor
> Boston Children's Hospital and Harvard Medical School
> timothy.miller@childrens.harvard.edu
> 617-919-1223
> 
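
The convention floated earlier in the thread (static methods that take no parameters, return a descriptor, and carry a marker annotation) could be discovered by reflection roughly as sketched below. This is an assumption-laden illustration in plain Java: `@AutoGenerateDescriptor` is only the name proposed in the thread and is not known to exist in uimaFIT, `FakeDescription` is a stand-in for `AnalysisEngineDescription` so that the sketch has no UIMA dependency, and `DescriptorScanner` and `getCustomAnnotator` are hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical marker annotation; the name comes from the thread, not from uimaFIT.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface AutoGenerateDescriptor {}

// Stand-in for AnalysisEngineDescription (assumption, keeps the sketch dependency-free).
final class FakeDescription {
    final String name;

    FakeDescription(String name) {
        this.name = name;
    }
}

// The "separate factory class" option from the thread.
final class SentenceDetectorAnnotatorFactory {
    @AutoGenerateDescriptor
    static FakeDescription getSentenceDetectorAnnotator() {
        return new FakeDescription("SentenceDetector");
    }

    // Hypothetical method that takes a parameter, so it falls outside the convention.
    static FakeDescription getCustomAnnotator(String modelPath) {
        return new FakeDescription("SentenceDetector:" + modelPath);
    }
}

final class DescriptorScanner {

    // Collects the names of descriptors produced by methods that are static,
    // parameterless, annotated, and return the descriptor type.
    static List<String> describe(Class<?> factory) {
        List<String> names = new ArrayList<>();
        for (Method m : factory.getDeclaredMethods()) {
            boolean matches = Modifier.isStatic(m.getModifiers())
                    && m.getParameterCount() == 0
                    && m.isAnnotationPresent(AutoGenerateDescriptor.class)
                    && FakeDescription.class.isAssignableFrom(m.getReturnType());
            if (!matches) {
                continue;
            }
            try {
                m.setAccessible(true);
                names.add(((FakeDescription) m.invoke(null)).name);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Could not call " + m.getName(), e);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(describe(SentenceDetectorAnnotatorFactory.class));
    }
}
```

The same scan would work for the other option in the thread (static methods added to the primitive annotator classes), since it only looks at method shape and the annotation, not at which class hosts the method. A build plugin would write each returned descriptor out as XML instead of collecting names.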
