stanbol-dev mailing list archives

From Olivier Grisel <olivier.gri...@ensta.org>
Subject Re: Global vision to adapt Apache Stanbol to a CMS system
Date Wed, 17 Aug 2011 12:55:39 GMT
2011/8/17  <jeronimofernandez@yerbabuena.es>:
> Olivier Grisel <olivier.grisel@ensta.org> wrote:
>
>> 2011/8/17  <jeronimofernandez@yerbabuena.es>:
>>>
>>> Hi again,
>>>
>>> I've been looking at the DBpedia and DBLP examples, but actually I need to
>>> define my own ontology with customised relations (not based on DBpedia
>>> definitions). I didn't explain my problem very well. I need to extract
>>> tags from plain text, yes, but I also need to make inferences over my
>>> domain (enterprise terms and concepts that are only used in a closed
>>> environment).
>>
>> What kind of inference? Please give an example of sample input and
>> expected output.
>>
>
> Input -> PTS is a component of a product for cleaning mowers.
> Inference -> PTS is a product for gardening (I defined this in a
> hierarchical ontology, as a superclass).

I don't get it: are the input and the output plain English text
(natural language) or RDF? Please give an example with the exact input
and output. Stanbol does not provide any component to generate natural
language.

>>> Is there any way to upload my ontology, previously defined in Protégé,
>>> with mimetype rdf+xml? I thought that once I uploaded the OWL file to a
>>> specific scope in ontonet, I could configure Stanbol to use this ontology.
>>
>> What do you mean by "use"?
>>
>
> I want Stanbol to "forget" the other sources. I need Apache Stanbol to
> choose, from the plain text, only tags defined in my ontology.

You can configure the TaxonomyLinkingEngine to use only one of the
EntityHub referenced sites (and not all of them). Alternatively,
you can unregister the dbpedia configuration in the EntityHub to
disable it globally.

> Example:
>
> I have a PTS but i don't have garden.
>
> Result:
>
> PTS->Component_for_mowers->Component_for_gardening
>
> Apache Stanbol doesn't extract "garden" because it isn't defined as an
> entity (class) in my OWL file or my own ontology.
>
>>> I've
>>> already created this ontology. If it's not possible, what steps do I
>>> need to follow in Apache Stanbol to define my own entities that the
>>> Natural Language Processing Engine can extract from plain text? And then,
>>> once these entities have been extracted, how can I connect them with my
>>> relations?
>>
>> What do you mean by "connecting entities with relations"?
>>
>
> Relation:
>
> Maybe_must_buy: Gardener->Component_for_gardening

So what you call a relation is a rule for a deductive reasoner?

> Example:
>
> I have a PTS. Peter is a gardener.
>
> Result:
>
> PTS->Component_for_cleaning_mowers->Component_for_gardening
> Peter->Gardener
> Peter Maybe_must_buy PTS
>
> Sorry for the example :). I hope you understand what I want to say.

So if I understand correctly, you want to compute all the
consequences (logical deductions) of the union of the logical facts
extracted from the natural language text of the submitted content with
a set of predefined rules registered on the Stanbol server.
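
To make sure we are talking about the same thing, here is a rough Python
sketch of that kind of deduction applied to your example. It is not Stanbol
code: the triple layout, the SUBCLASS_OF table and the infer() function are
all hypothetical names made up for illustration, and the two rules are
hard-coded rather than loaded from an ontology. It assumes the facts have
already been extracted from the text somehow.

```python
# Hypothetical class hierarchy taken from the example in this thread.
SUBCLASS_OF = {
    "Component_for_cleaning_mowers": "Component_for_gardening",
}


def infer(facts):
    """Return the closure of `facts` under two hard-coded rules.

    Facts are (subject, predicate, object) tuples. This is a naive
    forward-chaining loop: apply the rules until nothing new appears.
    """
    closure = set(facts)
    changed = True
    while changed:
        new = set()
        # Rule 1: x type C, C subclass of D  =>  x type D
        for (s, p, o) in closure:
            if p == "type" and o in SUBCLASS_OF:
                new.add((s, "type", SUBCLASS_OF[o]))
        # Rule 2: g type Gardener, c type Component_for_gardening
        #         =>  g Maybe_must_buy c
        gardeners = {s for (s, p, o) in closure
                     if p == "type" and o == "Gardener"}
        components = {s for (s, p, o) in closure
                      if p == "type" and o == "Component_for_gardening"}
        for g in gardeners:
            for c in components:
                new.add((g, "Maybe_must_buy", c))
        changed = not new <= closure
        closure |= new
    return closure


# Facts assumed to come from "I have a PTS. Peter is a gardener."
facts = {
    ("PTS", "type", "Component_for_cleaning_mowers"),
    ("Peter", "type", "Gardener"),
}
result = infer(facts)
```

With these inputs the closure contains the two original facts plus
(PTS, type, Component_for_gardening) and (Peter, Maybe_must_buy, PTS). The
deduction step itself is the easy part; getting the input facts out of the
text is where the difficulty lies.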

The main problem I see is that right now we don't have any enhancement
engine in Stanbol able to semantically parse / interpret English
language content to extract "facts" or "semantic assertions". We just
have engines to extract occurrences of named entities and/or
unambiguous domain-specific terms. Extracting facts is a far from
trivial task, and OpenNLP only provides some building blocks to
implement it.
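
To illustrate the gap, what term extraction over a closed vocabulary gives
you is roughly the following. Again this is only a Python sketch and not how
the Stanbol engines are implemented: VOCABULARY and spot_terms() are
hypothetical, and the matching is a naive case-insensitive lookup of single
words.

```python
import re

# Hypothetical label -> class mapping standing in for a custom ontology.
VOCABULARY = {
    "PTS": "Component_for_cleaning_mowers",
    "gardener": "Gardener",
}


def spot_terms(text, vocabulary):
    """Return (surface form, class, character offset) for each known label.

    Only whole words that exactly match a vocabulary label (ignoring case)
    are reported; everything else in the text is ignored.
    """
    lowered = {label.lower(): cls for label, cls in vocabulary.items()}
    hits = []
    for match in re.finditer(r"\w+", text):
        cls = lowered.get(match.group().lower())
        if cls is not None:
            hits.append((match.group(), cls, match.start()))
    return hits


hits = spot_terms("I have a PTS but i don't have garden.", VOCABULARY)
```

Here hits holds a single mention of "PTS", and "garden" is skipped because
it is not a label in the vocabulary. Note that the result is a list of
mentions, not assertions: nothing in it says who has the PTS or that Peter
is a gardener, which is what a reasoner would need as input.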

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
