uima-user mailing list archives

From Richard Eckart de Castilho <...@apache.org>
Subject Re: Running an AnalysisEngine on part of a document
Date Tue, 16 Feb 2016 11:17:27 GMT
The easiest option would be to remove the token/sentence annotations from those parts of the
text that you do not care about.
Alternatively, if you have annotations that specifically mark the text sections, you can
configure the segmenter component to create sentences/tokens only within the boundaries of
these annotations, using PARAM_ZONE_TYPES and PARAM_STRICT_ZONING.
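
To illustrate the first option, here is a minimal sketch of the filtering logic in plain Java.
It is not UIMA code: Span, ZoneFilter and keepInsideZones are hypothetical stand-ins for
annotations with begin/end offsets, and the rule "keep a token only if it lies completely
inside a zone" is my assumption about what you would want. In a real pipeline the same check
would run against the annotations in the CAS, and the rejected tokens would be removed from
the indexes before the POS tagger runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ZoneFilter {
    /** Hypothetical stand-in for an annotation: a [begin, end) character span. */
    public static final class Span {
        public final int begin;
        public final int end;

        public Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + begin + "," + end + ")";
        }
    }

    /** Returns only those tokens that lie completely inside at least one zone. */
    public static List<Span> keepInsideZones(List<Span> tokens, List<Span> zones) {
        List<Span> kept = new ArrayList<>();
        for (Span token : tokens) {
            for (Span zone : zones) {
                if (zone.begin <= token.begin && token.end <= zone.end) {
                    kept.add(token);
                    break; // one covering zone is enough
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // One zone marking the natural-language part of the document.
        List<Span> zones = Arrays.asList(new Span(10, 30));
        // Four tokens: one before the zone, two inside, one crossing its end.
        List<Span> tokens = Arrays.asList(
                new Span(0, 6), new Span(10, 14), new Span(15, 22), new Span(28, 33));
        System.out.println(keepInsideZones(tokens, zones)); // prints [[10,14), [15,22)]
    }
}
```

Tokens that only partly overlap a zone are dropped here; if you would rather keep them,
relax the containment test to an overlap test.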


-- Richard

> On 16.02.2016, at 12:02, Nils Reiter <lists@nilsreiter.de> wrote:
> Hi,
> is there a way to run an analysis engine on only a part of the CAS?
> I have UIMA annotations over all the substrings that I want to process. The only way
> I could think of is creating new views or CASs for each string, but that would
> result in > 100 views. Is there a more straightforward way?
> Background:
> Only part of the CAS contains natural language, other parts are lists, names and headers.
> I would like to POS-tag the text, but not the rest.
> Thanks in advance for any pointers or suggestions,
> Nils
