stanbol-dev mailing list archives

From Olivier Grisel <olivier.gri...@ensta.org>
Subject Re: Stanbol Enhancement Structure (discussion)
Date Sun, 06 Mar 2011 22:48:26 GMT
2011/3/6 Rupert Westenthaler <rupert.westenthaler@gmail.com>:
> On Fri, Mar 4, 2011 at 5:55 PM, Olivier Grisel <olivier.grisel@ensta.org> wrote:
>
> Just to be sure ... in the diagram the sb:EntityAnnotation and the
> sb:TextOccurrence would be created by the NLP engine and the
> EntitySuggestion would be created by an EntityTaggingEngine (e.g. the
> current Autotagger).
>
> That would mean that the NLP Engine can detect that "John Smith" and
> "Mr Smith" are the same Entity? Your issue is therefore - and in such
> a case that would be correct - that a single Annotation would need to
> have multiple Occurrences.
> If we need to model something like that, then we need to create a
> separate resource for each Occurrence and link it to the Annotation
> with a relation, e.g. "sb:hasOccurrence".

Yes, this is roughly what I described in the diagram.

> Regarding the naming I would suggest to remove the "Entity". This
> would mean to use "Annotation", "Suggestion" and "Occurrence" with
> sub-types "TextOccurrence", "MetadataOccurrence" ...

OK with removing Entity from the base class names. But I would also
like to have a subclass EntityAnnotation with the property
sb:entity-type defined on it.
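
To make the proposed structure concrete, here is a minimal sketch in
Python. It is not the Stanbol model itself: the field names (start, end,
selected_text, occurrences, entity_type) and the "Person" value are
assumptions chosen for illustration. It only shows one EntityAnnotation
carrying several TextOccurrences, as with "John Smith" and "Mr Smith".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Occurrence:
    """Base class: one place where an annotation shows up in the content."""


@dataclass
class TextOccurrence(Occurrence):
    start: int          # offset of the first character (hypothetical field)
    end: int            # offset just past the last character (hypothetical field)
    selected_text: str  # the mention as it appears in the text


@dataclass
class Annotation:
    # Plays the role of a "sb:hasOccurrence" relation: one annotation, many occurrences.
    occurrences: List[Occurrence] = field(default_factory=list)


@dataclass
class EntityAnnotation(Annotation):
    # Stands in for the sb:entity-type property; the value format is an assumption.
    entity_type: str = ""


text = "John Smith arrived late. Mr Smith apologized."
person = EntityAnnotation(entity_type="Person")
for mention in ("John Smith", "Mr Smith"):
    start = text.index(mention)
    person.occurrences.append(
        TextOccurrence(start=start, end=start + len(mention), selected_text=mention)
    )
```

Keeping the occurrences as separate objects means a MetadataOccurrence
could later be added as another Occurrence subclass without touching
Annotation.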

For Suggestion, I would rather name it LinkingSuggestion,
LinkSuggestion or ResolutionSuggestion to make it explicit that this
type of enhancement is about suggesting a link to a resource from the
LOD cloud or from a private LinkedData knowledge base (using the
entityhub as a proxy).
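
As a rough illustration of what such a linking suggestion could carry,
here is a small Python sketch. Everything in it is hypothetical: the
field names, the confidence score, the best_suggestion helper and the
example.org URIs are placeholders, not part of any agreed specification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LinkingSuggestion:
    annotation_id: str  # the annotation this suggestion refers to (hypothetical field)
    resource_uri: str   # candidate resource in the LOD cloud or a private knowledge base
    confidence: float   # assumed score in [0, 1]; not mentioned in the thread


def best_suggestion(
    suggestions: List[LinkingSuggestion], annotation_id: str
) -> Optional[LinkingSuggestion]:
    """Return the highest-confidence suggestion for an annotation, or None."""
    candidates = [s for s in suggestions if s.annotation_id == annotation_id]
    return max(candidates, key=lambda s: s.confidence, default=None)


suggestions = [
    LinkingSuggestion("annotation-1", "http://example.org/resource/John_Smith_A", 0.4),
    LinkingSuggestion("annotation-1", "http://example.org/resource/John_Smith_B", 0.7),
]
top = best_suggestion(suggestions, "annotation-1")
```

An annotation with no suggestions simply stays unresolved, which is why
the helper returns None instead of raising.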

> If you confirm that we need multiple TextOccurrences for a single
> Annotation, then I will make the corresponding changes to the
> specification.

Yes, I confirm this. The Temis annotation engine does the same kind of
modeling (except that resolution is somehow implicit). If I remember
correctly, OpenCalais also has a notion of an unresolved entity (or
local / ambiguous entity) with potentially many occurrences in the
document, and optionally links to "resolved" entities (for famous
persons, organizations and places) that occur in many documents.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
