stanbol-dev mailing list archives

From Rupert Westenthaler <rupert.westentha...@gmail.com>
Subject Re: Global vision to adapt Apache Stanbol to a CMS system
Date Tue, 30 Aug 2011 07:13:35 GMT
Hi Jero

I am replying to the original mail to keep the quoted text easier to
read.

On Tue, Aug 16, 2011 at 1:49 PM,  <jeronimofernandez@yerbabuena.es> wrote:
> Hi Stanbol devs,
>
> I've been working with Stanbol since two weeks ago. I need some information
> about how can I define a new ontology about products, users... in Apache
> Stanbol and how can i extract entities from reasoners to tag automatically a
> PDF document (i need to avoid tags from another sources, only the entities
> in my own OWL file).

As Olivier has already pointed out, the TaxonomyLinkingEngine should
be able to extract Concepts of your Ontology from natural language
text based on the labels of the Concepts within the Ontology.
As a result of the enhancement process you would get
* TextAnnotations marking the occurrences of the Entities within the
  parsed Text, each linked to one or more
* EntityAnnotations that reference a Concept of your Ontology

So what you could expect is a TextAnnotation with the selected-text
"PTS" that is linked to an EntityAnnotation for the entity
"http://www.your-ontology.com/concepts#PTS"
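
A minimal sketch of that result shape, not the Stanbol API: the function
name `annotate`, the dictionary keys and the `urn:text-annotation:` ids
are assumptions made for illustration, loosely modelled on the
TextAnnotation/EntityAnnotation pairing described above. It matches
concept labels in a text and links each occurrence to the concept URI.

```python
import re


def annotate(text, label_index):
    """Match concept labels in text.

    label_index maps a label (e.g. "PTS") to a list of concept URIs.
    Returns (text_annotations, entity_annotations) as lists of dicts.
    The dict keys are simplified, hypothetical property names.
    """
    text_annotations = []
    entity_annotations = []
    for label, uris in label_index.items():
        # Whole-word, case-insensitive match of the label.
        pattern = r"\b" + re.escape(label) + r"\b"
        for match in re.finditer(pattern, text, re.IGNORECASE):
            ta_id = "urn:text-annotation:%d" % len(text_annotations)
            text_annotations.append({
                "id": ta_id,
                "selected-text": match.group(0),
                "start": match.start(),
                "end": match.end(),
            })
            # One EntityAnnotation per suggested concept, pointing back
            # to the TextAnnotation it belongs to.
            for uri in uris:
                entity_annotations.append({
                    "relation": ta_id,
                    "entity-reference": uri,
                    "entity-label": label,
                })
    return text_annotations, entity_annotations


# Example input; the sentence is made up, the URI is the one used above.
tas, eas = annotate(
    "Replace the PTS before mowing.",
    {"PTS": ["http://www.your-ontology.com/concepts#PTS"]},
)
```

Each EntityAnnotation carries a back-reference to its TextAnnotation, so
one occurrence in the text can be linked to several candidate Concepts.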

Reasoning - such as also tagging Documents with parent Concepts of a
Taxonomy or a sub-class hierarchy - is currently not supported.
However such features could be added by implementing an additional
Enhancement Engine.

Therefore the results of the enhancement will not include an
EntityAnnotation for either
"http://www.your-ontology.com/concepts#Component_for_mowers" or
"http://www.your-ontology.com/concepts#Component_for_gardening"

If you are interested in an EnhancementEngine that would also add
annotations like these, the developers of the Ontonet, Reasoning
and Rules modules are the best people to ask.
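
To illustrate what such an additional engine would have to compute, here
is a rough sketch that collects all parent Concepts of a linked Concept.
The `BROADER` mapping and the `ancestors` function are hypothetical, and
the hierarchy (PTS below Component_for_mowers below
Component_for_gardening) is only assumed from the example URIs; a real
engine would read the hierarchy from the Ontology.

```python
BASE = "http://www.your-ontology.com/concepts#"

# Assumed hierarchy, for illustration only:
# concept URI -> list of direct parent URIs.
BROADER = {
    BASE + "PTS": [BASE + "Component_for_mowers"],
    BASE + "Component_for_mowers": [BASE + "Component_for_gardening"],
}


def ancestors(uri, broader):
    """Return all transitive parents of uri, nearest first, no duplicates."""
    seen = []
    queue = list(broader.get(uri, []))
    while queue:
        parent = queue.pop(0)
        if parent in seen:
            # Already visited; also protects against cycles.
            continue
        seen.append(parent)
        queue.extend(broader.get(parent, []))
    return seen


parents_of_pts = ancestors(BASE + "PTS", BROADER)
```

An engine along these lines could then add one extra EntityAnnotation
per returned parent URI.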

>
> The targets i've reached are the following:
>
> 1-. I've put a OWL file containing the ontology in ontonet module with its
> curl call.
>
> 2-. I created scopes and recipes.
>
> I don't know how configure the environment to analyze a plain text and use
> this customised ontology to extract tags. I need some global vision about
> the problem and if it's possible some examples.
>
> If anyone can help me don't hesitate to answer me.
>

The TaxonomyLinkingEngine cannot use the scopes and recipes of the
ontonet module. This is mainly because it depends on fast full-text
queries that the ontonet component cannot provide.
You need to import your Ontology into the Entityhub and then configure
an instance of the TaxonomyLinkingEngine accordingly.

How to do this (including the different options available) is
described in detail in this email [1].


Thanks for your interest in Apache Stanbol!
best
Rupert Westenthaler


[1] http://markmail.org/message/52266yl5ohijxiof



-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
