uima-user mailing list archives

From Nicolas Hernandez <nicolas.hernan...@gmail.com>
Subject Re: Consuming RDF ontologies as dictionaries
Date Wed, 02 Nov 2011 08:45:46 GMT
Hi Spico,

For sure, it is possible with UIMA. For now, you have components in the
sandbox (like the Dictionary Annotator or the Concept Mapper
Annotator) which aim at recognizing text forms in a text from

Personally, I was not satisfied with the current solutions: they were
either too simple (no features can be associated with an entry of the
DictionaryAnnotator) or too complex for me to set up (the Concept
Mapper Annotator is based on a tokenizer that was different from
mine).

Based on previous work by Jerome Rocheteau, I developed a simple
Dictionary Annotator with the following features:
  * a dictionary is a UIMA resource (one instance can be shared by
multiple annotators) [1]
  * the dictionary design is abstract enough to allow several implementations
  * right now it comes with one implementation of a dictionary format:
CSV (one column is the entry and the others are feature values), but
XML RDF would be an easy extension
  * the dictionary entries are character strings, stored as a prefix
tree of characters so that recognition is fast
  * it is not type system dependent
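
To make the CSV and prefix tree points concrete, here is a minimal sketch in plain Java. It is not the actual code of the annotator described above: the class name PrefixTreeDictionary, its methods, and the longest-match strategy are all assumptions made for illustration, and the CSV handling is a naive split on commas.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: dictionary entries stored as a prefix tree of characters. */
class PrefixTreeDictionary {

    /** One tree node; features is non-null only where a dictionary entry ends. */
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        List<String> features;
    }

    /** A recognized entry: [begin, end) character offsets plus its feature values. */
    static final class Match {
        final int begin;
        final int end;
        final List<String> features;

        Match(int begin, int end, List<String> features) {
            this.begin = begin;
            this.end = end;
            this.features = features;
        }
    }

    private final Node root = new Node();

    /**
     * Adds one CSV line: the first column is the entry, the others are feature values.
     * Assumption: columns never contain quoted commas.
     */
    void addCsvLine(String line) {
        String[] columns = line.split(",", -1);
        String entry = columns[0].trim();
        if (entry.isEmpty()) {
            return;
        }
        Node node = root;
        for (char c : entry.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        List<String> features = new ArrayList<>();
        for (int i = 1; i < columns.length; i++) {
            features.add(columns[i].trim());
        }
        node.features = features;
    }

    /** Scans the text left to right and keeps the longest entry found at each start offset. */
    List<Match> findAll(String text) {
        List<Match> matches = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            Node node = root;
            Match longest = null;
            for (int i = start; i < text.length(); i++) {
                node = node.children.get(text.charAt(i));
                if (node == null) {
                    break; // no entry continues with this character
                }
                if (node.features != null) {
                    longest = new Match(start, i + 1, node.features);
                }
            }
            if (longest != null) {
                matches.add(longest);
                start = longest.end; // skip past the recognized entry
            } else {
                start++;
            }
        }
        return matches;
    }
}
```

Each lookup walks the tree once per start offset, so the cost depends on the length of the longest entry rather than on the number of entries. Note that this sketch matches raw characters and ignores token boundaries; a real annotator would probably restrict matches to token starts and ends.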

It would not cost too much to add an extension to deal with XML RDF (only a
parser and the connector to the data structure). I have planned to open
the code soon, but I can make it available sooner if you are interested
in participating.
Anyway, I'll be interested to know a bit more about how you wanted to
use your RDF format (what are the entries, the values...).
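
As a rough idea of what the parser part could look like, here is a sketch that uses only the standard Java XML API. It rests on assumptions about your data: that the entries are the rdfs:label values and that the value to attach is the rdf:about URI of the enclosing resource. The class name RdfXmlLabelReader is hypothetical, and only this simple RDF/XML shape is handled.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical sketch: reads (label, resource URI) pairs out of an RDF/XML document. */
class RdfXmlLabelReader {

    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#";

    /** Returns label -> URI for every rdfs:label whose parent element carries rdf:about. */
    static Map<String, String> readLabels(String rdfXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to resolve the rdf: and rdfs: prefixes
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(rdfXml.getBytes(StandardCharsets.UTF_8)));

            Map<String, String> entries = new LinkedHashMap<>();
            NodeList labels = doc.getElementsByTagNameNS(RDFS_NS, "label");
            for (int i = 0; i < labels.getLength(); i++) {
                Element label = (Element) labels.item(i);
                Node parent = label.getParentNode();
                if (!(parent instanceof Element)) {
                    continue;
                }
                // getAttributeNS returns an empty string when the attribute is absent
                String uri = ((Element) parent).getAttributeNS(RDF_NS, "about");
                String text = label.getTextContent().trim();
                if (!uri.isEmpty() && !text.isEmpty()) {
                    entries.put(text, uri);
                }
            }
            return entries;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read RDF/XML input", e);
        }
    }
}
```

Each returned pair could then be fed to the dictionary as an entry with the URI as its feature value. Data coming from a live triplestore, or ontologies that use other label properties, would need a proper RDF library instead of this DOM walk.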

Best regards


[1] http://uima.apache.org/downloads/releaseDocs/2.1.0-incubating/docs/html/tutorials_and_users_guides/tutorials_and_users_guides.html#ugr.tug.aae.accessing_external_resource_files

On Mon, Oct 31, 2011 at 9:41 PM, Alexander Klenner
<alexander.garvin.klenner@scai.fraunhofer.de> wrote:
> Hi Florin,
> I think what you are looking for is a UIMA type system that corresponds to your specific RDF ontologies (URIs). As far as I know, you must implement this type system by hand (experienced UIMA users, please correct me if I am wrong here...).
> There is an RDF CAS Consumer to be found in the UIMA sandbox:
> http://uima.apache.org/sandbox.html#rdfcas.consumer
> that does it the other way round: an existing type system in a CAS is converted to RDF triplestore format. But the URIs created from the type system change from one run to another for the same artefact, which makes them not really usable in a bigger RDF context. But maybe this could be a starting point for further investigation...
> Cheers,
> Alex
> --
> Dipl. Bioinformatiker Alexander G. Klenner
> Fraunhofer-Institute for Algorithms and Scientific Computing (SCAI)
> Schloss Birlinghoven, D-53754 Sankt Augustin
> Tel.: +49 - 2241 - 14 - 2736
> E-mail: alexander.garvin.klenner@scai.fraunhofer.de
> Internet: http://www.scai.fraunhofer.de
> ----- Original Message -----
> From: "Spico Florin" <spicoflorin@gmail.com>
> To: user@uima.apache.org
> Sent: Monday, 31 October 2011 16:48:18
> Subject: Consuming RDF ontologies as dictionaries
> Hello!
>  I'm a newbie to UIMA. I would like to know if it is possible to create a
> dictionary (vocabulary) from an RDF triplestore. I would like UIMA to
> be used to classify the words contained in a text by using a given ontology
> stored in a triplestore.
> How can I use UIMA in this particular use case?
>  I look forward to your answers.
>  Thank you.
>  Regards,
>  Florin

Dr. Nicolas Hernandez
Associate Professor (Maître de Conférences)
Université de Nantes - LINA CNRS
+33 (0)2 51 12 58 55
+33 (0)2 40 30 60 67
