uima-user mailing list archives

From Eddie Epstein <eaepst...@gmail.com>
Subject Re: Suggestion for CPE stats
Date Fri, 23 Jul 2010 01:36:05 GMT
Hi Eric,

I'm not sure which, but one of the UIMA command line tools does report total
document size at the end of processing.

However, there are some problems with this suggestion. The UIMA framework
pretty much just moves CASes around without looking inside them. If it did
look inside, which view would it look at? And what about non-text artifacts?

I would treat this as an application design issue: have a CAS consumer do
the count and make the total available at collection process complete.
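
A rough sketch of what I mean, in plain Java so it runs without the UIMA
jars. SimpleCas and CharCountingConsumer are hypothetical stand-ins, not
UIMA classes; in a real CPE the same logic would go into the processCas
and collectionProcessComplete methods of a CAS consumer. The application
chooses which view to count, and a CAS with no text in that view is
skipped rather than guessed at.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a CAS: a set of named views, each with optional text.
class SimpleCas {
    private final Map<String, String> viewText = new HashMap<>();

    // A null text models a non-text artifact (audio, image, ...).
    SimpleCas withView(String viewName, String text) {
        viewText.put(viewName, text);
        return this;
    }

    // Returns null if the view is missing or carries no text.
    String getDocumentText(String viewName) {
        return viewText.get(viewName);
    }
}

// Hypothetical consumer that accumulates character counts for one chosen view.
class CharCountingConsumer {
    private final String viewName;
    private long documents;
    private long characters;
    private long skipped;

    CharCountingConsumer(String viewName) {
        this.viewName = viewName;
    }

    // Called once per CAS, like a CAS consumer's per-CAS processing method.
    void processCas(SimpleCas cas) {
        documents++;
        String text = cas.getDocumentText(viewName);
        if (text == null) {
            skipped++; // no text in the chosen view, nothing to count
            return;
        }
        // String.length() counts UTF-16 code units, not Unicode code points.
        characters += text.length();
    }

    // Called once at the end of the collection; returns the summary to report.
    String collectionProcessComplete() {
        return String.format("documents=%d characters=%d skipped=%d",
                documents, characters, skipped);
    }

    long getDocuments() { return documents; }
    long getCharacters() { return characters; }
    long getSkipped() { return skipped; }
}
```

Keeping the count in the application means the view choice and the
handling of non-text CASes are explicit decisions, not framework guesses.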

Eddie

On Tue, Jul 20, 2010 at 4:22 PM, Eric Riebling <er1k@cs.cmu.edu> wrote:
> Although it's useful to know how many documents have been processed,
> that figure is not nearly as useful as how many CHARACTERS have been
> processed by a given CPE or set of components within a CPE, since
> processing per document is much faster if your documents are tiny
> than if they are huge.
> So I think it would be a great thing to include Characters Processed
> in the stats window of the Performance Report.
> --
> Eric Riebling  GHC 6713,  LTI,   SCS,  CMU
> 412.268.9872   http://www.cs.cmu.edu/~er1k
>
