stanbol-dev mailing list archives

From Olivier Grisel <olivier.gri...@ensta.org>
Subject Re: Global vision to adapt Apache Stanbol to a CMS system
Date Tue, 16 Aug 2011 12:40:35 GMT
2011/8/16  <jeronimofernandez@yerbabuena.es>:
> Hi Stanbol devs,
>
> I've been working with Stanbol since two weeks ago. I need some information
> about how can I define a new ontology about products, users... in Apache
> Stanbol and how can i extract entities from reasoners to tag automatically a
> PDF document (i need to avoid tags from another sources, only the entities
> in my own OWL file).
>
> The targets i've reached are the following:
>
> 1-. I've put a OWL file containing the ontology in ontonet module with its
> curl call.
>
> 2-. I created scopes and recipes.
>
> I don't know how configure the environment to analyze a plain text and use
> this customised ontology to extract tags. I need some global vision about
> the problem and if it's possible some examples.
>
> If anyone can help me don't hesitate to answer me.

Hi,

You most probably don't need the reasoners to process *unstructured
data* such as natural language text content. Text analysis is done by
Enhancement engines, which can rely on the EntityHub as a
domain-specific knowledge base.

If the names of your entities are very specific to your domain (not
ambiguous) then the TaxonomyLinkingEngine coupled with a dedicated
referenced site in the EntityHub that indexes your knowledge base
sounds like the right approach.
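
To give a rough idea of what this kind of label-based linking does,
here is a minimal Python sketch. It is not the TaxonomyLinkingEngine
and does not use any Stanbol API: the vocabulary, the example.org URIs
and the link_entities function are all made up for illustration. It
only shows why unambiguous entity names make a simple lookup against
your own knowledge base sufficient.

```python
import re

# Hypothetical domain vocabulary: label -> entity URI (made-up example data,
# standing in for the entities indexed from your own OWL file).
VOCABULARY = {
    "acme rocket skates": "http://example.org/product/rocket-skates",
    "acme anvil": "http://example.org/product/anvil",
}


def link_entities(text, vocabulary):
    """Return (surface form, uri, start, end) for each vocabulary label in text."""
    matches = []
    for label, uri in vocabulary.items():
        # Whole-word, case-insensitive match of the label.
        pattern = r"\b" + re.escape(label) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            matches.append((m.group(0), uri, m.start(), m.end()))
    # Order the annotations by their position in the text.
    return sorted(matches, key=lambda match: match[2])


text = "The new ACME Anvil ships next to the Acme Rocket Skates."
for surface, uri, start, end in link_entities(text, VOCABULARY):
    print(surface, uri, start, end)
```

Because only labels from your own vocabulary are looked up, no tags
from other sources can show up in the result. With ambiguous names,
such a plain lookup would no longer be enough.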

To index your knowledge base within the EntityHub, you can follow the
existing indexing examples (for DBpedia and DBLP respectively):

  https://svn.apache.org/repos/asf/incubator/stanbol/trunk/entityhub/indexing/dbpedia/README.md
  https://svn.apache.org/repos/asf/incubator/stanbol/trunk/entityhub/indexing/dblp/README.md

However, I don't have any usage example for configuring the
TaxonomyLinkingEngine, and Rupert, who is the original developer of this
module, is off for a couple of weeks AFAIK.

Note: reasoners are useful to process *structured* data (a.k.a.
knowledge): converting assertions already expressed in one RDF
vocabulary (e.g. dbpedia.org) into another (e.g. schema.org), checking
integrity constraints, reifying transitive and reflexive relationships
prior to indexing...
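
As an illustration of the last point, here is a small Python sketch of
expanding a transitive relationship before indexing. It is a toy
fixed-point loop over plain tuples, not the Stanbol reasoners API; the
"broader" predicate and the entity names are assumptions made up for
the example.

```python
def materialize_transitive(triples, predicate):
    """Return the triples plus every triple implied by transitivity of predicate."""
    inferred = set(triples)
    changed = True
    # Repeat until a full pass adds nothing new (fixed point).
    while changed:
        changed = False
        edges = [(s, o) for (s, p, o) in inferred if p == predicate]
        for s1, o1 in edges:
            for s2, o2 in edges:
                if o1 == s2 and (s1, predicate, o2) not in inferred:
                    inferred.add((s1, predicate, o2))
                    changed = True
    return inferred


# Hypothetical structured data: (subject, predicate, object) triples.
triples = {
    ("RocketSkates", "broader", "Footwear"),
    ("Footwear", "broader", "Product"),
}

for triple in sorted(materialize_transitive(triples, "broader")):
    print(triple)
```

The input here is already structured knowledge, which is why this kind
of processing does not apply directly to raw text.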

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
