uima-user mailing list archives

From "Kannan Chellappa" <kchella...@kana.com>
Subject CPM batch processing
Date Mon, 12 Nov 2007 07:06:26 GMT
I want to process my document collection using CPM and I want to use the
batch feature.

The documentation says that the following CPM method

    void process(CollectionReader aCollectionReader, int aBatchSize)
            throws ResourceInitializationException


breaks the processing into batches of size determined by the aBatchSize
parameter. Each CasConsumer will be notified at the end of the batch.


When I tried this method in my application, processing stopped after
the first batch of documents. I was hoping that execution would
continue with the next batch of documents once each batch was
complete.
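
The expectation described above can be sketched in plain Java. This is only an illustration of the hoped-for semantics, not UIMA code: BatchSketch, BatchListener, and processInBatches are hypothetical names that do not exist in the UIMA API, and the sketch makes no claim about how the CPM actually behaves.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSketch {

    /** Hypothetical stand-in for a consumer that is notified per batch; not a UIMA interface. */
    interface BatchListener {
        void documentProcessed(int docIndex);
        void batchComplete(int batchNumber);
        void collectionComplete();
    }

    /**
     * Walks through documentCount documents and notifies the listener after
     * every batchSize documents, then continues with the next batch until
     * the whole collection has been processed.
     */
    static void processInBatches(int documentCount, int batchSize, BatchListener listener) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int inBatch = 0;
        int batchNumber = 0;
        for (int doc = 0; doc < documentCount; doc++) {
            listener.documentProcessed(doc);
            inBatch++;
            if (inBatch == batchSize) {
                batchNumber++;
                listener.batchComplete(batchNumber);
                inBatch = 0; // start the next batch instead of stopping
            }
        }
        if (inBatch > 0) {
            // A final, partially filled batch still gets a notification.
            batchNumber++;
            listener.batchComplete(batchNumber);
        }
        listener.collectionComplete();
    }

    /** Records the sequence of notifications as strings so it can be inspected. */
    static List<String> run(int documentCount, int batchSize) {
        final List<String> events = new ArrayList<>();
        processInBatches(documentCount, batchSize, new BatchListener() {
            @Override public void documentProcessed(int docIndex) {
                events.add("document " + docIndex);
            }
            @Override public void batchComplete(int batchNumber) {
                events.add("batch " + batchNumber + " complete");
            }
            @Override public void collectionComplete() {
                events.add("collection complete");
            }
        });
        return events;
    }

    public static void main(String[] args) {
        // Mirrors the test setup in this message: 8 documents, batch size 4.
        for (String event : run(8, 4)) {
            System.out.println(event);
        }
    }
}
```

Under this reading, 8 documents with a batch size of 4 would produce two batch notifications and all 8 documents would be processed, rather than processing ending after the first 4.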


I tried the following as a test.


I downloaded the uimaj-2.2.0 binaries onto my computer and used
SimpleRunCPM from the examples to perform my test.


I modified SimpleRunCPM.java in org.apache.uima.examples.cpe,
changed the batch size to 4 (instead of 10), and then ran it with the
following command-line arguments






I modified the FileSystemCollectionReader.xml to have the default as


The input folder has 8 text files, but the processing completes after 4.

Is this the expected behavior? If not, is there anything I need to change
in the code to get multiple batches to work?


Thanks in advance for any help


