uima-user mailing list archives

From Benedict Holland <benedict.m.holl...@gmail.com>
Subject Run an analysis engine after processing document collection?
Date Fri, 22 Dec 2017 17:26:59 GMT
Hello All,

I find myself in a strange situation. I have a collection processing engine
(CPE) working. I have N threads populating N CAS objects and running my
pipeline. Each CAS object gets one piece of data, such as a row in a database. Each
process is entirely independent and can run concurrently. I specifically
did not configure this pipeline as an aggregate process as I don't really
care when the events trigger since the CPE maintains the order of the

Now I want to add an analysis that will run over the aggregate output. For
example, I processed N texts using the CPE and now I want to run a TF-IDF
analysis over the entire corpus. The TF-IDF analysis should run only once,
after all documents have been processed.
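
The two-phase shape described above can be sketched in plain Java. This is only an illustration of the requirement, not UIMA code: the class name `CorpusTfIdf` and the methods `addDocument` and `finish` are hypothetical, and the weighting used (relative term frequency times the natural log of N over document frequency) is just one common TF-IDF variant, assumed here for concreteness.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of UIMA: collects per-document counts
// concurrently, then computes TF-IDF once over the whole corpus.
class CorpusTfIdf {
    // term -> number of documents that contain it
    private final Map<String, AtomicInteger> docFreq = new ConcurrentHashMap<>();
    // document id -> (term -> raw count in that document)
    private final Map<String, Map<String, Integer>> termCounts = new ConcurrentHashMap<>();

    // Phase 1: may be called from many worker threads, once per document id.
    public void addDocument(String docId, List<String> tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (String t : tokens) {
            counts.merge(t, 1, Integer::sum);
        }
        termCounts.put(docId, counts);
        for (String t : counts.keySet()) {
            docFreq.computeIfAbsent(t, k -> new AtomicInteger()).incrementAndGet();
        }
    }

    // Phase 2: call exactly once, after every document has been added.
    public Map<String, Map<String, Double>> finish() {
        int n = termCounts.size();
        Map<String, Map<String, Double>> result = new HashMap<>();
        for (Map.Entry<String, Map<String, Integer>> doc : termCounts.entrySet()) {
            int total = 0;
            for (int c : doc.getValue().values()) {
                total += c;
            }
            Map<String, Double> scores = new HashMap<>();
            for (Map.Entry<String, Integer> term : doc.getValue().entrySet()) {
                double tf = (double) term.getValue() / total;
                double idf = Math.log((double) n / docFreq.get(term.getKey()).get());
                scores.put(term.getKey(), tf * idf);
            }
            result.put(doc.getKey(), scores);
        }
        return result;
    }
}
```

The point of the split is that `addDocument` only touches concurrent maps and so tolerates independent per-document workers, while `finish` needs the complete corpus and must wait until the last document is in.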

How would I go about doing this? Does this have to do with not allowing
multiple deployments?

