incubator-stanbol-commits mailing list archives

From "Enrico Daga (JIRA)" <>
Subject [jira] Created: (STANBOL-107) semantic description of the engines
Date Tue, 01 Mar 2011 17:13:36 GMT
semantic description of the engines

                 Key: STANBOL-107
             Project: Stanbol
          Issue Type: New Feature
          Components: Enhancer
            Reporter: Enrico Daga
            Priority: Minor

It would be nice to find a way to let engines declare what kind of contribution they are going
to provide.
I see at least the following kinds of enhancements:
1) tagging: detect keywords, entities, concepts "within" the content
2) categorization/classification: locate the content in a conceptual place within a given
framework. For example, an engine could state that the document has "Second World War" as its
primary topic, or "Theatre" in the framework of DBpedia categories, or state that it is an
e-mail or a news item in the framework of the CMS document types;
3) metadata: the engine extracts metadata from within the content. For instance, it returns
the PDF metadata in RDF using the Dublin Core vocabulary
4) embedded knowledge: the source document is rich HTML (with RDFa, Microformats) or it
is a structured file (why not an RDF file, say a FOAF profile?)

Then, the enhancement engine should also say HOW it contributes in terms of vocabulary elements:
1) Does the engine add annotation roles?
2) Does it add entity types?
3) Which metadata fields will it return?
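
As a rough illustration of what such a declaration might hold, here is a minimal Python sketch.
It is not an existing Stanbol API: the class names, field names, engine names and prefixed terms
are all hypothetical, chosen only to mirror the four kinds of enhancement and the three
vocabulary questions above.

```python
from dataclasses import dataclass
from enum import Enum


class ContributionKind(Enum):
    """The four kinds of enhancement listed in this issue (names are hypothetical)."""
    TAGGING = "tagging"
    CATEGORIZATION = "categorization"
    METADATA = "metadata"
    EMBEDDED_KNOWLEDGE = "embedded-knowledge"


@dataclass(frozen=True)
class EngineDescription:
    """What an engine contributes, and with which vocabulary elements."""
    name: str
    kinds: frozenset                 # set of ContributionKind values
    annotation_roles: tuple = ()     # 1) annotation roles the engine adds
    entity_types: tuple = ()         # 2) entity types the engine adds
    metadata_fields: tuple = ()      # 3) metadata fields the engine returns


def engines_providing(descriptions, kind):
    """Return the names of the engines that declare the given kind of contribution."""
    return [d.name for d in descriptions if kind in d.kinds]


# Two made-up engines, used only to exercise the model.
pdf_engine = EngineDescription(
    name="pdf-metadata-engine",
    kinds=frozenset({ContributionKind.METADATA}),
    metadata_fields=("dc:title", "dc:creator", "dc:date"),
)
tagging_engine = EngineDescription(
    name="entity-tagging-engine",
    kinds=frozenset({ContributionKind.TAGGING}),
    entity_types=("ex:Person", "ex:Place"),
)

metadata_engines = engines_providing(
    [pdf_engine, tagging_engine], ContributionKind.METADATA
)
```

With declarations like these, a client could select engines by the kind of contribution it
needs before submitting any content.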

This could be done with an RDF description stating which terms the engine will introduce
in relation to those of the Stanbol Enhancement base ontology (STANBOL-52). This is also
related to STANBOL-3.
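
To make the idea of an RDF description more concrete, the sketch below emits a few N-Triples
for one engine using plain Python string formatting. The "http://example.org/..." namespace,
the class EnhancementEngine and the properties providesContribution and introducesTerm are
placeholders invented for this example; the actual terms would have to come from the base
ontology discussed in STANBOL-52. Only the rdf:type and Dublin Core URIs are real.

```python
# Hypothetical namespace for the engine-description vocabulary (an assumption).
EX = "http://example.org/stanbol/engine-description#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DC = "http://purl.org/dc/elements/1.1/"


def describe_engine(engine_uri, kind, introduced_terms):
    """Build (subject, predicate, object) triples describing one engine."""
    triples = [
        (engine_uri, RDF_TYPE, EX + "EnhancementEngine"),
        (engine_uri, EX + "providesContribution", EX + kind),
    ]
    # One triple per vocabulary term the engine declares it will introduce.
    triples += [(engine_uri, EX + "introducesTerm", term) for term in introduced_terms]
    return triples


def to_ntriples(triples):
    """Serialize triples whose three components are all URIs as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# A made-up PDF metadata engine that returns two Dublin Core fields.
pdf_triples = describe_engine(
    "http://example.org/engines/pdf-metadata",
    "Metadata",
    [DC + "title", DC + "creator"],
)
pdf_description = to_ntriples(pdf_triples)
print(pdf_description)
```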

This message is automatically generated by JIRA.