Hi,
iteratePipeline and runPipeline should be mostly equivalent.
They differ if, for example, you have a CAS multiplier within
an aggregate engine.
runPipeline delegates the execution to the UIMA core and is able
to handle CAS multipliers.
iteratePipeline (re)uses a single CAS instance which is passed
to the reader and all analysis engines in turn. It does not
support CAS multipliers.
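
Since the same CAS instance is reused for every document, anything you
want to keep (such as per-type counts) is best copied out while you are
still inside the loop body. Below is a minimal sketch of such a tally in
plain Java. It has no UIMA dependency: the class name AnnotationTally and
the type name "my.types.Example" are made up for illustration, and the
strings stand in for what annotation.getType().getName() returns in the
displayRutaResults() method quoted below.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of uimaFIT or Ruta: counts how often
// each annotation type name has been seen.
final class AnnotationTally {
    // TreeMap keeps the type names sorted, which makes the output stable.
    private final Map<String, Integer> counts = new TreeMap<>();

    // Record one occurrence of the given type name.
    void add(String typeName) {
        counts.merge(typeName, 1, Integer::sum);
    }

    // Counts collected so far, keyed by type name.
    Map<String, Integer> counts() {
        return counts;
    }

    public static void main(String[] args) {
        AnnotationTally tally = new AnnotationTally();
        // Stand-in data; in the real loop each string would come from
        // annotation.getType().getName() for one annotation in the JCas.
        for (String typeName : List.of("Line", "my.types.Example", "Line")) {
            tally.add(typeName);
        }
        System.out.println(tally.counts());
    }
}
```

The same add() call could equally be made from within an analysis engine
that is passed to runPipeline, which would sidestep the question of which
pipeline method to use.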
A user recently pointed out that uimaFIT 2.2.0 reintroduces a bug
in iteratePipeline - typeSystemInit() is not called [1].
@Peter: could the missing call to typeSystemInit() be a problem for Ruta?
Cheers,
-- Richard
[1] https://issues.apache.org/jira/browse/UIMA-4998
> On 07.07.2016, at 09:17, Peter Klügl <peter.kluegl@averbis.com> wrote:
>
> Hi,
>
> I have no idea yet why the code with iteratePipeline does not work.
>
> Richard, do you have an idea?
>
> Are there any exceptions? Do you use the rae object anywhere? Is your
> code hosted somewhere, e.g., on GitHub? What do you mean by your own
> annotations? Annotations of an external type system, or annotations added
> by another engine or reader?
>
> Best,
>
> Peter
>
> On 06.07.2016 at 02:41, Bonnie MacKellar wrote:
>> I have a very lengthy Ruta script which annotates my files successfully. I
>> can see all the annotations in AnnotationBrowser and they are correct.
>> I want to get all the annotations in a Java program, so I can count
>> occurrences. I am using uimaFIT. I am getting very odd results.
>>
>> When I use CasDumpWriter, I see all my annotations, correctly written to
>> the dump file. Here is the code that does this:
>> -------------------------------------------------------------------------------------------------------
>> AnalysisEngineDescription rutaEngineDesc =
>>     AnalysisEngineFactory.createEngineDescription(RutaEngine.class,
>>         RutaEngine.PARAM_MAIN_SCRIPT, "ecClassifier",
>>         RutaEngine.PARAM_SCRIPT_PATHS, new String[] {
>>             "/home/bonnie/Research/eclipse-uima-projects/counttypes/src/main/ruta"},
>>         RutaEngine.PARAM_DESCRIPTOR_PATHS, new String[] {
>>             "/home/bonnie/Research/eclipse-uima-projects/counttypes/target/generated-sources/ruta/descriptor"},
>>         RutaEngine.PARAM_ADDITIONAL_UIMAFIT_ENGINES,
>>         "org.apache.uima.ruta.engine.PlainTextAnnotator");
>> AnalysisEngineDescription writerDesc =
>>     AnalysisEngineFactory.createEngineDescription(CasDumpWriter.class,
>>         CasDumpWriter.PARAM_OUTPUT_FILE, "dump2.txt");
>> AnalysisEngine rae = AnalysisEngineFactory.createEngine(rutaEngineDesc);
>> SimplePipeline.runPipeline(readerDesc, rutaEngineDesc, writerDesc);
>> -----------------------------------------------------------------------------------------------------
>>
>> However, when I try to do this myself, using iteratePipeline to iterate
>> through the JCas structures for each input file, many of the annotations
>> are missing. I have a suspicion that the missing annotations are ones that
>> annotate text for which there is another annotation. For example, text
>> will be annotated with Line, and with my own annotation. My code to print
>> the annotations is based on the code in CasDumpWriter.
>>
>> -----------------------------------------------------------------------------------------------------
>>
>> for (JCas jcas : SimplePipeline.iteratePipeline(readerDesc,
>>         rutaEngineDesc)) {
>>     displayRutaResults(jcas);
>> }
>>
>> public void displayRutaResults(JCas jcas)
>> {
>>     System.out.println("in display ruta results");
>>
>>     FSIterator<Annotation> annotationIter =
>>         jcas.getAnnotationIndex().iterator();
>>     while (annotationIter.hasNext())
>>     {
>>         AnnotationFS annotation = annotationIter.next();
>>         System.out.println(annotation.getType().getName());
>>         System.out.println(annotation.getCoveredText());
>>
>>         System.out.println("------------------------------------------");
>>         // System.out.println(annotation.toString());
>>     }
>> }
>>
>> ------------------------------------------------------------------------------------------------
>>
>> Why would this code produce different results than CasDumpWriter, which
>> uses almost exactly the same code? Is it something to do with using
>> runPipeline vs iteratePipeline? Should I write my code so it can be placed
>> inside runPipeline?
>>
>> thanks so much!
>> Bonnie MacKellar
>>
>