uima-user mailing list archives

From swirl <swirl...@yahoo.com>
Subject Re: Using uima pipeline as an API
Date Thu, 25 Jul 2013 02:15:03 GMT
Richard Eckart de Castilho <richard.eckart@...> writes:
 
> 
> You should take a look at the JCasIterable (cf. [1] - Example in Groovy, but
> JCasIterable is a Java class and works nicely in Java too, just I have no
> example in Java).
>
> JCasIterable basically allows you to iterate over the CASes produced by your
> pipeline. In such a loop, you can extract and collect the data you need from
> the CASes, e.g. putting it into a List<String> and returning it. Mind that you
> should *not* try to keep hold of full CASes or FeatureStructures (including
> Annotations and stuff). You need to copy the data from the CAS, otherwise
> it will be corrupted.
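
To make the warning above concrete, here is a minimal plain-Java sketch of the "copy out, do not hold on" pattern. It does not use UIMA or uimaFIT at all: ReusedDoc and pipeline() are hypothetical stand-ins for a CAS and for JCasIterable, and the sketch assumes that the iterable hands back one container that is refilled for every document, which is what the advice in the quoted message suggests.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class CopyOutSketch {

    /** Hypothetical stand-in for a CAS: one mutable object refilled for each document. */
    public static final class ReusedDoc {
        private String text = "";

        void reset(String newText) {
            text = newText;
        }

        public String getText() {
            return text;
        }
    }

    /** Hypothetical stand-in for JCasIterable: yields the SAME ReusedDoc on every step. */
    public static Iterable<ReusedDoc> pipeline(final List<String> inputs) {
        final ReusedDoc shared = new ReusedDoc();
        return () -> new Iterator<ReusedDoc>() {
            private int index = 0;

            @Override
            public boolean hasNext() {
                return index < inputs.size();
            }

            @Override
            public ReusedDoc next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                // Overwrites whatever the previous iteration left in the container.
                shared.reset(inputs.get(index++));
                return shared;
            }
        };
    }

    /** Safe: copies plain data (immutable Strings) out of the container inside the loop. */
    public static List<String> collectTexts(List<String> inputs) {
        List<String> texts = new ArrayList<>();
        for (ReusedDoc doc : pipeline(inputs)) {
            texts.add(doc.getText());
        }
        return texts;
    }

    /** Unsafe: keeps references to the container, which is overwritten on the next step. */
    public static List<ReusedDoc> collectReferences(List<String> inputs) {
        List<ReusedDoc> docs = new ArrayList<>();
        for (ReusedDoc doc : pipeline(inputs)) {
            docs.add(doc);
        }
        return docs;
    }

    public static void main(String[] args) {
        List<String> inputs = List.of("first doc", "second doc");
        System.out.println(collectTexts(inputs));
        for (ReusedDoc doc : collectReferences(inputs)) {
            System.out.println(doc.getText());
        }
    }
}
```

In this sketch collectTexts() returns both texts intact, while every element returned by collectReferences() ends up showing the last document, because all list entries point at the same reused object. With a real pipeline the equivalent would presumably be to copy covered text or feature values into Strings or your own objects inside the loop.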




Hi Richard,
I was reading your reference for using JCasIterable
(https://code.google.com/p/dkpro-core-asl/wiki/GroovyRecipies#OpenNLP_Part-of-speech_tagging_pipeline_using_JCasIterable_and_c),
but I have some questions.

Your example creates a JCasIterable using the following code:

def pipeline = new JCasIterable(
  createReaderDescription(TextReader,
    TextReader.PARAM_PATH, args[0],
    TextReader.PARAM_LANGUAGE, args[1],
    TextReader.PARAM_PATTERNS, ["[+]*.txt"]),
  createEngineDescription(OpenNlpSegmenter),
  createEngineDescription(OpenNlpPosTagger));

I assume that createReaderDescription() and createEngineDescription() return
a CollectionReaderDescription and an AnalysisEngineDescription, respectively.
But when I looked at the constructor for JCasIterable, it only accepts a
CollectionReader and a varargs array of AnalysisEngine instances, not their
descriptions:

 JCasIterable(final CollectionReader aReader, final AnalysisEngine... aEngines)

Why is this so?







