stanbol-dev mailing list archives

From Dileepa Jayakody <dileepajayak...@gmail.com>
Subject [GSoC] [Update] FOAF Co-reference based Entity Disambiguation in Stanbol
Date Tue, 17 Sep 2013 11:11:36 GMT
Hi All,

I have successfully implemented and tested the FOAF disambiguation engine
with the help of the Stanbol community, including Rupert, Rafa and Andreas.

The engine's main function is to increase the confidence of
Entity-Annotations identified by previous engines, using two
techniques:
1. Processing co-referencing URI references in the entities to detect
connectedness
2. Comparing each entity's foaf:name with the fise:selected-text

The objective of processing co-referencing URIs is to increase the
confidence of the most 'connected' entity among the suggested entities.
All URI/Reference type fields of each entity are extracted and compared
against the other suggested entities to find co-references. The most
connected entity has the largest number of URI matches, and its
disambiguated-confidence is increased accordingly.
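
To illustrate the idea, here is a minimal Python sketch of the
co-reference counting. It is not the engine's actual code (the engine is
a Java bundle); the function names, the dict-of-sets input and the
"weight" parameter are assumptions made for this example only.

```python
def count_coreferences(suggestions):
    """Count URI matches between each suggested entity and the others.

    `suggestions` maps an entity URI to the set of URI-valued field
    values extracted from that entity (a hypothetical input shape).
    """
    scores = {}
    for uri, refs in suggestions.items():
        matches = 0
        for other_uri, other_refs in suggestions.items():
            if other_uri == uri:
                continue
            # References that both entities share count as matches.
            matches += len(refs & other_refs)
            # A direct link to the other suggested entity also counts.
            if other_uri in refs:
                matches += 1
        scores[uri] = matches
    return scores


def boost_by_connectedness(confidences, scores, weight=0.1):
    """Raise each confidence in proportion to its share of URI matches.

    `weight` is an assumed tuning constant; results are capped at 1.0.
    """
    top = max(scores.values(), default=0)
    if top == 0:
        return dict(confidences)
    return {
        uri: min(1.0, conf + weight * scores[uri] / top)
        for uri, conf in confidences.items()
    }
```

With three suggestions where "a" links to "b" and both reference the
same URI "x", "a" gets the highest score and therefore the largest boost.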

The second technique is literal matching of the entity's foaf:name field
against the fise:selected-text values in the content. On an exact match,
the disambiguated-confidence is increased. Finally, the cumulative
disambiguated-confidence is calculated and adjusted.
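
A rough Python sketch of the name matching and the cumulative
confidence follows. It is only an illustration: the function names, the
"name_boost" value and the cap at 1.0 are assumptions, not the values
used in the engine.

```python
def name_matches(foaf_names, selected_text):
    """Return True if any foaf:name equals the selected text exactly.

    Surrounding whitespace is trimmed before comparing (an assumption);
    the comparison itself is case-sensitive.
    """
    target = selected_text.strip()
    return any(name.strip() == target for name in foaf_names)


def disambiguated_confidence(original, coref_boost, name_matched,
                             name_boost=0.2):
    """Combine the original confidence with both boosts, capped at 1.0.

    `coref_boost` is the increase earned from URI co-references and
    `name_boost` is an assumed constant added on an exact name match.
    """
    total = original + coref_boost
    if name_matched:
        total += name_boost
    return min(1.0, total)
```

For example, an entity with original confidence 0.5, a co-reference
boost of 0.1 and an exact name match would end up at roughly 0.8 under
these assumed values.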

This engine requires Entity-Annotations extracted by previous engines,
and an Entityhub pre-configured with FOAF entities.
I'm using my foaf-site together with the available dbpedia site in the
enhancement chain to detect more entities and increase the effectiveness
of the FOAF disambiguation engine.

The engine's source is hosted on GitHub [1].
To execute the engine:
1. Build the Maven project from the GitHub repository with the command
"mvn clean install"
2. Start Stanbol and install the bundle:
org.apache.stanbol.enhancer.engines.disambiguation.foaf-1.0-SNAPSHOT.jar
3. Configure the foaf-site-chain with the new disambiguation engine and
dbpediaLinking (to detect more entities)
(Note: To configure the foaf-site-chain with the pre-built foaf-site,
please refer to my mid-term update mail or follow the guide at [2].)

The new engine is identified by the name "disambiguation-foaf".
After the enhancement chain is configured successfully, the
foaf-site-chain should look like this:
Engines: langdetect, opennlp-sentence, opennlp-token, opennlp-pos,
foaf-site-linking, opennlp-ner, dbpediaLinking, disambiguation-foaf.
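
Once the chain is configured, it can be tried out through the Stanbol
Enhancer REST endpoint for named chains. The Python sketch below only
builds the request; the base URL (a local launcher on port 8080), the
helper name and the Accept format are assumptions for this example.

```python
import urllib.request


def build_enhance_request(text,
                          base_url="http://localhost:8080",
                          chain="foaf-site-chain"):
    """Build a POST request that sends plain text to an enhancement chain.

    `base_url` assumes a Stanbol launcher running locally on port 8080;
    adjust it to match the actual installation.
    """
    return urllib.request.Request(
        url=f"{base_url}/enhancer/chain/{chain}",
        data=text.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            # Ask for the enhancement results as RDF/XML.
            "Accept": "application/rdf+xml",
        },
    )
```

Passing the returned object to urllib.request.urlopen() sends the text
to a running Stanbol instance and returns the enhancement results,
where the effect of disambiguation-foaf on the confidence values can be
inspected.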

Please have a look at the disambiguation-foaf engine, try it out and send
your comments for improvements. I appreciate your advice very much.
I will also create a Stanbol JIRA issue with all the details and attach
the source code and the built bundle after finalizing the project.

Thanks for all the support given throughout the project.

Regards,
Dileepa

[1] https://github.com/dileepajayakody/foaf-disambiguation
[2] https://github.com/dileepajayakody/FOAFSite
