opennlp-dev mailing list archives

From: Hannes Korte
Subject: Re: OpenNLP Annotations Proposal
Date: Wed, 22 Jun 2011 17:38:39 GMT

On 06/22/2011 06:50 PM, Olivier Grisel wrote:
> I am ok with switching to UIMA CAS. We might need additional metadata
> outside of the CAS annotations though. For instance, if an annotator
> fixes a typo in the Sofa itself, we might need to be able to tell
> that Sofa1 is subject to being replaced by Sofa2 according to
> annotator A1.

Do we have one CAS per sentence or one CAS per document? If the former
is the case, then we will need some more metadata around the CAS
documents to be able to show the context of a given sentence (if that is
needed at all). If the latter is the case, then this will lead to many
different Sofas, which only differ in a few characters, right?

If we want to add disambiguation and coref information into the
annotator UI at a later stage, then one CAS per document would be much
more useful.
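
To make the trade-off concrete, here is a minimal sketch in plain Python.
It does not use the UIMA API: Span, DocumentCas and SentenceCas are
hypothetical stand-ins, and the sample text and offsets are made up for
illustration. It only shows that with one CAS per document the sentence
context and cross-sentence links are plain offsets into one text, while
with one CAS per sentence we would have to carry extra metadata.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """A stand-off annotation: character offsets into one text."""
    begin: int
    end: int
    label: str


@dataclass
class DocumentCas:
    """Hypothetical stand-in for one CAS per document (not the UIMA API)."""
    text: str
    spans: list = field(default_factory=list)

    def covered_text(self, span):
        return self.text[span.begin:span.end]

    def context(self, span, window=20):
        # Context comes for free: widen the offsets within the same text.
        start = max(0, span.begin - window)
        end = min(len(self.text), span.end + window)
        return self.text[start:end]


@dataclass
class SentenceCas:
    """Hypothetical stand-in for one CAS per sentence."""
    text: str
    doc_id: str  # assumed extra metadata: which document this came from
    index: int   # assumed extra metadata: position within that document


# One CAS per document: sentences and mentions share one offset space.
doc = DocumentCas("Anna lives in Berlin. She works at a bank.")
first = Span(0, 21, "Sentence")
second = Span(22, 42, "Sentence")
anna = Span(0, 4, "Mention")
she = Span(22, 25, "Mention")
doc.spans.extend([first, second, anna, she])

# A coreference link across sentences is just a pair of spans.
coref_chain = [anna, she]

# One CAS per sentence: the same link would need (doc_id, index, offsets).
pieces = [
    SentenceCas(doc.covered_text(first), doc_id="doc-1", index=0),
    SentenceCas(doc.covered_text(second), doc_id="doc-1", index=1),
]
```

With the per-sentence layout, showing the previous sentence in the UI
means looking up the neighbour through doc_id and index, and the coref
link above can no longer be expressed inside a single CAS.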

