ctakes-dev mailing list archives

From Steven Bethard <steven.beth...@gmail.com>
Subject Re: suggestion for default pipelines
Date Wed, 16 Apr 2014 03:21:05 GMT
+1. And note that once you have a descriptor, you can generate the
XML from it, so we should arrange to replace the current XML descriptors
with ones generated automatically from the uimaFIT code. That should
reduce the synchronization problems that arise when the Java code is
changed but the XML descriptor is not.

Steve

On Tue, Apr 15, 2014 at 8:52 AM, Miller, Timothy
<Timothy.Miller@childrens.harvard.edu> wrote:
> The discussion in the other thread with Abraham Tom gave me an idea I
> wanted to float to the list. We have been using some uimaFIT pipeline
> builders in the temporal project that could perhaps be moved into
> clinical-pipeline. For example, look at this file:
>
> http://svn.apache.org/viewvc/ctakes/trunk/ctakes-temporal/src/main/java/org/apache/ctakes/temporal/pipelines/TemporalExtractionPipeline_ImplBase.java?view=markup
>
> with the static methods getPreprocessorAggregateBuilder() and
> getLightweightPreprocessorAggregateBuilder()   [no umls].
>
> So my idea would be to create a class in clinical-pipeline
> (CTAKESPipelines) with static methods for some standard pipelines (to
> return AnalysisEngineDescriptions instead of AggregateBuilders?):
>
> getStandardUMLSPipeline()  -- builds pipeline currently in
> AggregatePlaintextUMLSProcessor.xml
> getFullPipeline() -- same as above but with SRL, constituency parsing,
> etc., every component in ctakes
>
> We could then potentially merge our entry points -- I think Abraham's
> experience shows that the current setup is confusing, and probably
> not implemented optimally. For example, either
> ClinicalPipelineWithUmls or BagOfCUIsGenerator would use that static
> method to run a uimaFIT-style pipeline. Maybe we can slowly deprecate
> our XML descriptors too, unless people feel strongly about keeping
> those around.
>
> Another benefit is that the cTAKES API is then trivial -- once you add
> ctakes as a dependency in your pom file, getting a UIMA pipeline is a
> single uimaFIT call:
>
> builder.add(CTAKESPipelines.getStandardUMLSPipeline());
>
>
> I think this would actually be pretty easy to implement, but hoping to
> get some feedback on whether this is a good direction.
>
> Tim
>
>
>
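The two ideas in this thread (static factory methods that return a pipeline description, and XML generated from that description instead of maintained by hand) can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: `PipelineDescription` is a hypothetical stand-in for UIMA's `AnalysisEngineDescription`, `CTAKESPipelinesSketch` stands in for the class proposed above, and the component names are placeholders, not the actual contents of AggregatePlaintextUMLSProcessor.xml. With uimaFIT on the classpath, the corresponding steps would presumably be `AggregateBuilder.createAggregateDescription()` to obtain the description and the description's `toXML(...)` method to serialize it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for an AnalysisEngineDescription: an ordered list of component names. */
final class PipelineDescription {
    private final List<String> components;

    PipelineDescription(List<String> components) {
        // Defensive copy so a description cannot change after it is built.
        this.components = Collections.unmodifiableList(new ArrayList<>(components));
    }

    List<String> components() {
        return components;
    }

    /**
     * Serializes the description. Because the XML is derived from the code,
     * the two cannot drift apart. No XML escaping is done; the placeholder
     * names used here contain no special characters.
     */
    String toXml() {
        StringBuilder sb = new StringBuilder("<pipeline>\n");
        for (String name : components) {
            sb.append("  <component name=\"").append(name).append("\"/>\n");
        }
        return sb.append("</pipeline>\n").toString();
    }
}

/** Sketch of the proposed class of static pipeline factories (name is illustrative). */
final class CTAKESPipelinesSketch {
    private CTAKESPipelinesSketch() {}

    /** Placeholder for the "standard UMLS" pipeline. */
    static PipelineDescription getStandardUMLSPipeline() {
        List<String> parts = new ArrayList<>();
        parts.add("Preprocessing");          // placeholder component name
        parts.add("UmlsDictionaryLookup");   // placeholder component name
        return new PipelineDescription(parts);
    }

    /** The standard pipeline plus the extra components mentioned in the proposal. */
    static PipelineDescription getFullPipeline() {
        List<String> parts = new ArrayList<>(getStandardUMLSPipeline().components());
        parts.add("ConstituencyParser");     // placeholder component name
        parts.add("SemanticRoleLabeler");    // placeholder component name
        return new PipelineDescription(parts);
    }

    public static void main(String[] args) {
        // Regenerate the XML from code whenever the pipeline definition changes.
        System.out.print(getFullPipeline().toXml());
    }
}
```

Defining the full pipeline in terms of the standard one keeps a single definition of the shared components, which is the same synchronization benefit the reply above attributes to generating the XML from code.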
