uima-user mailing list archives
From Thilo Goetz <twgo...@gmx.de>
Subject Re: Best way to compare FSIndexes from different CASes on the same Sofa
Date Wed, 21 Apr 2010 09:08:41 GMT
On 4/20/2010 18:22, Bart Mellebeek wrote:
> Hi,
> 
> I was wondering about the following question: what is the best way to
> compare FSIndexes from different CASes on the same Sofa?
> I have two different CASes on the same SofaString: CAS1 was obtained as
> the result of running an Annotator; CAS2 was obtained as the result of
> manual processing using the CASEditor. Each CAS contains different
> FSIndexes. Is it possible to process CAS1, store the FSIndexes I need,
> then process CAS2 and compare the indexes or is the only valid way to
> merge the CASes and do the comparison on a single CAS?
> Thanks,
> 
> Bart

This is something that many people need, and I suspect it's
been implemented many times.  However, it's difficult to
create library functions to do this as the requirements are
often subtly different.

Anyway, what I've done and what I've seen other people do
is to write custom code that takes the two CASes and their
respective indexes, iterates over them in parallel and
compares them by whatever criteria are important to you
(the begin/end offsets may be all that's relevant, or there
may be more).
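
To make the parallel iteration concrete, here is a minimal sketch in
plain Java. It is only an illustration under stated assumptions, not
a library function: Span, Result and SpanComparison are hypothetical
names, and each list stands in for the begin/end offsets you would
read from one CAS's annotation index (via getBegin()/getEnd() on each
annotation). Both lists are assumed to be sorted by ascending begin
and then descending end, which is how UIMA's built-in annotation
index orders its entries. Only offsets are compared; add type or
feature checks if they matter to you.

```java
import java.util.ArrayList;
import java.util.List;

final class SpanComparison {

    /** Hypothetical stand-in for one annotation's begin/end offsets. */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + begin + "," + end + ")";
        }
    }

    /** Spans found in both lists, and spans found in only one of them. */
    static final class Result {
        final List<Span> matched = new ArrayList<>();
        final List<Span> onlyInFirst = new ArrayList<>();
        final List<Span> onlyInSecond = new ArrayList<>();
    }

    /** Ascending begin, then descending end (longer span first). */
    static int order(Span a, Span b) {
        if (a.begin != b.begin) {
            return Integer.compare(a.begin, b.begin);
        }
        return Integer.compare(b.end, a.end);
    }

    /**
     * Walks both sorted lists in parallel, like the merge step of a
     * merge sort. Assumes both inputs are already sorted by order().
     */
    static Result compare(List<Span> first, List<Span> second) {
        Result result = new Result();
        int i = 0;
        int j = 0;
        while (i < first.size() && j < second.size()) {
            Span a = first.get(i);
            Span b = second.get(j);
            int c = order(a, b);
            if (c == 0) {            // same offsets in both lists
                result.matched.add(a);
                i++;
                j++;
            } else if (c < 0) {      // a sorts earlier: no partner in second
                result.onlyInFirst.add(a);
                i++;
            } else {                 // b sorts earlier: no partner in first
                result.onlyInSecond.add(b);
                j++;
            }
        }
        // Whatever remains in either list has no partner in the other.
        while (i < first.size()) {
            result.onlyInFirst.add(first.get(i++));
        }
        while (j < second.size()) {
            result.onlyInSecond.add(second.get(j++));
        }
        return result;
    }
}
```

If the first list comes from the annotator and the second from the
manual annotation, matched counts the exact-offset agreements, and
the two "only" lists are the spurious and missed annotations
respectively.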

You can store your manually annotated CAS by writing it to
disk in XCAS or XMI format, and reading it back into a CAS
every time you need it.

--Thilo
