uima-user mailing list archives

From "Zesch, Torsten" <torsten.ze...@uni-due.de>
Subject Re: I want something to run last over aggregates
Date Wed, 10 Oct 2018 15:32:27 GMT
Hi Ben,

Each annotator can implement collectionProcessComplete().

Quoting from the documentation:
"The framework calls the collectionProcessComplete() method at the end of the collection (i.e.,
when all objects in the collection have been processed). At this point in time, no CAS is
passed in as a parameter. This gives the CAS Consumer or Analysis Engine an opportunity to
perform collection processing over the entire set of objects in the collection."

In our implementation of tf.idf, an annotator collects the tf score for each document
in process() and then computes the idf part in collectionProcessComplete().
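
To make the pattern concrete, here is a minimal sketch in plain Java. It is not our actual code and it does not use the UIMA API: the class name TfIdfCollector, the docId/token parameters, and the returned map are all made up for illustration, and it assumes each document id is seen only once. In a real annotator you would typically extend JCasAnnotator_ImplBase, read the tokens from the CAS in process(JCas), and override collectionProcessComplete(), which takes no arguments.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for an annotator (hypothetical; NOT the UIMA API).
class TfIdfCollector {
    // Raw term counts per document, filled in by process().
    private final Map<String, Map<String, Integer>> tfByDoc = new LinkedHashMap<>();
    // Number of documents in which each term occurs at least once.
    private final Map<String, Integer> docFreq = new HashMap<>();

    // Called once per document; plays the role of process(JCas).
    // Assumes every docId is passed in only once.
    void process(String docId, List<String> tokens) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : tokens) {
            tf.merge(token, 1, Integer::sum);
        }
        tfByDoc.put(docId, tf);
        for (String term : tf.keySet()) {
            docFreq.merge(term, 1, Integer::sum);
        }
    }

    // Called once after the last document; plays the role of
    // collectionProcessComplete(). Only now is the collection size known,
    // so the idf part can be computed here.
    Map<String, Map<String, Double>> collectionProcessComplete() {
        int numDocs = tfByDoc.size();
        Map<String, Map<String, Double>> result = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, Integer>> doc : tfByDoc.entrySet()) {
            Map<String, Double> scores = new HashMap<>();
            for (Map.Entry<String, Integer> e : doc.getValue().entrySet()) {
                // idf = ln(N / df); one common variant, chosen for this sketch.
                double idf = Math.log((double) numDocs / docFreq.get(e.getKey()));
                scores.put(e.getKey(), e.getValue() * idf);
            }
            result.put(doc.getKey(), scores);
        }
        return result;
    }
}
```

The point is only the split: per-document state is accumulated in process(), and anything that needs the whole collection waits until collectionProcessComplete().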

-Torsten

On 10.10.18, 17:21, "Benedict Holland" <benedict.m.holland@gmail.com> wrote:

    Hello all,
    
    I continue to have a problem that comes up a lot. I have a collection
    processing engine. I want something to run after all of the processing is
    done. For example, I have a collection of texts and want to run a tf-idf. I
    generate a tf for each document and at the end, I generate an idf over the
    collection. I can't put that in an annotator as part of my
    processing pipeline.
    
    Is there an aggregate annotator that will run after the entire collection
    is processed?
    
    Thanks,
    ~Ben
    
