uima-user mailing list archives

From Swirl <lriwsw...@gmail.com>
Subject Re: Designing collection readers: Reading multiple XML files containing multiple CASes
Date Thu, 10 Oct 2013 08:21:42 GMT
> 
> For part c:
> 
> I imagine an algorithm that can scan the main XML file and find the "sections".
> For each section it finds, it can produce a CAS and initialize that CAS with the
> section's information.
> 
> If this algorithm lives inside an analysis component, then it can use the "CAS
> Multiplier" to produce the additional CASes, one for each segment.
> 
> See
> http://uima.apache.org/d/uimaj-2.4.2/tutorials_and_users_guides.html#ugr.tug.cm
> 
> Is that what you're looking for, or is that off-base?
> 
> -Marshall
 
Yes, this is what I wanted.

I tried using a CAS Multiplier.
For the most part it worked (e.g. when used with
SimplePipeline.runPipeline or CpePipeline.runPipeline).

But when I tried to use it in a CollectionProcessingEngine, it produced only 1
CAS instead of the several CASes that should have been produced from 1 input
document.

Here are my steps:
a. Create the CR description "readerDesc" to read in a text file.
b. Create the AnalysisEngineDescription "simpleTextSegmenterDesc" for
   SimpleTextSegmenter.class, and the AnalysisEngineDescription
   "casConsumerWriterDesc" to write each CAS into an XMI file.
c. AggregateBuilder aggregateBuilder = new AggregateBuilder();
   aggregateBuilder.add(simpleTextSegmenterDesc);
   aggregateBuilder.add(casConsumerWriterDesc);
   AnalysisEngineDescription aaeDesc =
       aggregateBuilder.createAggregateDescription();
   aaeDesc.getAnalysisEngineMetaData()
       .getOperationalProperties().setOutputsNewCASes(false);
d. CpeBuilder builder = new CpeBuilder();
   builder.setReader(readerDesc);
   builder.setAnalysisEngine(aaeDesc);
e. CollectionProcessingEngine cpe =
       builder.createCpe(StatusCallbackListener);
f. cpe.process();

Only 1 XMI file was produced instead of the several that I expected.
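
For reference, the one-input, many-outputs contract that the segmenter in step b relies on can be sketched in plain Java. This is only an illustration and not UIMA code: the class name SegmentMultiplierSketch and the blank-line splitting rule are assumptions made for the example, and it returns plain strings where a real CAS Multiplier would return new CASes.

```java
// Plain-Java sketch of the CAS Multiplier contract: one input, many outputs.
// Not UIMA code: SegmentMultiplierSketch and the blank-line rule are hypothetical.
class SegmentMultiplierSketch {
    private String[] segments = new String[0];
    private int nextIndex = 0;

    // Counterpart of process(): receive one input document and find its sections.
    void process(String document) {
        String trimmed = document.trim();
        // Assumption: sections are separated by one or more blank lines.
        segments = trimmed.isEmpty() ? new String[0] : trimmed.split("\\n\\s*\\n");
        nextIndex = 0;
    }

    // Counterpart of hasNext(): true while sections remain to be emitted.
    boolean hasNext() {
        return nextIndex < segments.length;
    }

    // Counterpart of next(): a real multiplier would return a new CAS here.
    String next() {
        if (!hasNext()) {
            throw new IllegalStateException("no more segments");
        }
        return segments[nextIndex++].trim();
    }

    public static void main(String[] args) {
        SegmentMultiplierSketch multiplier = new SegmentMultiplierSketch();
        multiplier.process("first section\n\nsecond section\n\n\nthird section");
        int produced = 0;
        // The caller keeps asking for outputs until the input is exhausted.
        while (multiplier.hasNext()) {
            System.out.println(produced + ": " + multiplier.next());
            produced++;
        }
        System.out.println("segments produced: " + produced);
    }
}
```

In this sketch each call to next() stands in for one output CAS, so the number of XMI files I expect is the number of times hasNext() returns true for a given input document.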

Is CAS Multiplier usable in CPE?
According to the documentation, I need to wrap it in an aggregate AE with 


