uima-user mailing list archives

From Michael Tanenblatt <sloth...@park-slope.net>
Subject Re: Consuming RDF ontologies as dictionaries
Date Wed, 02 Nov 2011 10:51:47 GMT
Just FYI regarding ConceptMapper: one of the key design points of ConceptMapper is that *any*
tokenizer annotator can be used--the one supplied is just an example--and you can set it up
so that that tokenizer is also used to tokenize your dictionary, to minimize missed matches
due to differing tokenizations between the input text and dictionaries.
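
To make the point about shared tokenization concrete, here is a minimal sketch in plain Python. It is not ConceptMapper's API or lookup algorithm; `tokenize`, `build_index` and `find_matches` are hypothetical names invented for the illustration, and the sample entries are made up. It only shows why tokenizing the dictionary with the same tokenizer as the input text avoids missed matches.

```python
import re


def tokenize(text):
    """Example tokenizer: lowercased runs of word characters.
    Any tokenizer would do, as long as both sides use the same one."""
    return re.findall(r"\w+", text.lower())


def build_index(entries, tokenizer):
    """Key each dictionary entry by its token sequence."""
    return {tuple(tokenizer(entry)): entry for entry in entries}


def find_matches(text, index, tokenizer):
    """Greedy longest-first lookup of token sequences in the index."""
    tokens = tokenizer(text)
    max_len = max((len(key) for key in index), default=0)
    matches = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in index:
                matches.append(index[key])
                i += n
                break
        else:
            i += 1  # no entry starts at this token
    return matches


entries = ["T-cell receptor", "IL-2"]  # made-up sample entries
text = "The t-cell receptor binds IL-2."

# Same tokenizer on both sides: both entries are found.
shared = find_matches(text, build_index(entries, tokenize), tokenize)
print(shared)  # ['T-cell receptor', 'IL-2']

# Dictionary split on whitespace, text split on word characters: nothing matches.
mismatched = find_matches(text, build_index(entries, str.split), tokenize)
print(mismatched)  # []
```

With the whitespace-split dictionary, "T-cell" stays one token while the text yields "t" and "cell", so the lookup never lines up.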


On Nov 2, 2011, at 4:45 AM, Nicolas Hernandez wrote:

> Hi Spico,
> 
> For sure, it is possible with UIMA. For now, there are components in the
> sandbox (like the Dictionary Annotator or the Concept Mapper
> Annotator) which aim at recognizing text forms in a text from
> dictionaries.
> 
> Personally, I was not satisfied with the current solutions: they were
> either too simple (no features can be associated with an entry of the
> DictionaryAnnotator) or too complex for me to set up (the Concept
> Mapper Annotator is based on a tokenizer which was different from
> mine).
> 
> Based on previous work by Jerome Rocheteau, I developed a simple
> Dictionary Annotator with the following features:
>  * a dictionary is a UIMA resource (one instance can be shared by
> multiple annotators) [1]
>  * the dictionary design is abstract enough to allow several implementations
>  * right now it comes with one implementation of a dictionary format:
> CSV (one column is the entry and the others are feature values), but
> XML RDF would be easy to add
>  * the dictionary entries are character strings, stored as a prefix
> tree of characters so that recognition is fast
>  * it is not type system dependent
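
The CSV-plus-prefix-tree design in the list above can be sketched roughly as follows. This is not the annotator's actual code, which has not been published in this thread; `CharTrie`, `load_csv` and `find_entries` are hypothetical names, the CSV rows are made-up sample data, and the sketch assumes the first column is the entry and the remaining columns are feature values. It does no word-boundary checking, so entries can match inside longer words.

```python
import csv
import io


class CharTrie:
    """Prefix tree of characters; the None key marks the end of an entry."""

    def __init__(self):
        self.root = {}

    def add(self, entry, features):
        node = self.root
        for ch in entry:
            node = node.setdefault(ch, {})
        node[None] = features

    def longest_match(self, text, start):
        """Return (end offset, features) of the longest entry starting at start."""
        node = self.root
        best = None
        for i in range(start, len(text)):
            node = node.get(text[i])
            if node is None:
                break
            if None in node:
                best = (i + 1, node[None])
        return best


def load_csv(trie, csv_text):
    """First column is the entry, remaining columns are feature values."""
    for row in csv.reader(io.StringIO(csv_text)):
        if row:
            trie.add(row[0], row[1:])


def find_entries(trie, text):
    """Scan left to right, keeping the longest entry at each position."""
    found = []
    pos = 0
    while pos < len(text):
        hit = trie.longest_match(text, pos)
        if hit is None:
            pos += 1
        else:
            end, features = hit
            found.append((pos, end, features))
            pos = end
    return found


CSV_DATA = "aspirin,drug\nheadache,symptom\nhead,body part\n"  # sample data

trie = CharTrie()
load_csv(trie, CSV_DATA)
result = find_entries(trie, "aspirin for a headache")
print(result)  # [(0, 7, ['drug']), (14, 22, ['symptom'])]
```

Because the tree is walked one character at a time, each text position is checked against all entries in a single pass, and "headache" wins over its prefix "head".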
> 
> It would not cost too much to add an extension to deal with XML RDF (only a
> parser and the connector to the data structure). I have planned to open
> the code soon, but I can make it available sooner if you're interested
> in participating.
> Anyway, I'd be interested to know a bit more about how you want to
> use your RDF format (what the entries are, the values...)
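
As a rough idea of what the "parser" half of such an extension might look like, the sketch below reads RDF/XML with Python's standard library and turns every `rdfs:label` into a (dictionary entry, URI) pair. It rests on assumptions the thread does not confirm: that the labels of interest are `rdfs:label` literals, and that subjects are identified with `rdf:about`. The function name `labels_from_rdf_xml` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def labels_from_rdf_xml(xml_text):
    """Return (label, subject URI) pairs for each rdfs:label child
    of an element that carries an rdf:about attribute."""
    root = ET.fromstring(xml_text)
    entries = []
    for node in root.iter():
        uri = node.get(f"{{{RDF}}}about")
        if uri is None:
            continue
        for label in node.findall(f"{{{RDFS}}}label"):
            if label.text:
                entries.append((label.text.strip(), uri))
    return entries


# Made-up sample ontology fragment.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://example.org/onto#Aspirin">
    <rdfs:label>aspirin</rdfs:label>
    <rdfs:label>acetylsalicylic acid</rdfs:label>
  </rdf:Description>
</rdf:RDF>"""

entries = labels_from_rdf_xml(SAMPLE)
for label, uri in entries:
    print(label, "->", uri)
```

Each pair could then be fed to the dictionary data structure, with the label as the entry and the URI as a feature value.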
> 
> Best regards
> 
> /Nicolas
> 
> [1] http://uima.apache.org/downloads/releaseDocs/2.1.0-incubating/docs/html/tutorials_and_users_guides/tutorials_and_users_guides.html#ugr.tug.aae.accessing_external_resource_files
> 
> 
> 
> On Mon, Oct 31, 2011 at 9:41 PM, Alexander Klenner
> <alexander.garvin.klenner@scai.fraunhofer.de> wrote:
>> Hi Florin,
>> 
>> I think what you are looking for is a UIMA type system that corresponds to your
>> specific RDF ontologies (URIs). As far as I know, you must implement this type system by hand
>> (experienced UIMA users, please correct me if I am wrong here...).
>> 
>> There is an RDF CAS Consumer to be found in the UIMA sandbox:
>> 
>> http://uima.apache.org/sandbox.html#rdfcas.consumer
>> 
>> that does it the other way round: an existing type system in a CAS is converted to
>> RDF triplestore format. But the URIs created from the type system change from one run to another
>> for the same artefact, which makes them not really usable in a bigger RDF context. But maybe
>> this could be a starting point for further investigation...
>> 
>> Cheers,
>> 
>> Alex
>> 
>> 
>> 
>> --
>> Dipl. Bioinformatiker Alexander G. Klenner
>> Fraunhofer-Institute for Algorithms and Scientific Computing (SCAI)
>> Schloss Birlinghoven, D-53754 Sankt Augustin
>> Tel.: +49 - 2241 - 14 - 2736
>> E-mail: alexander.garvin.klenner@scai.fraunhofer.de
>> Internet: http://www.scai.fraunhofer.de
>> 
>> 
>> ----- Original Message -----
>> From: "Spico Florin" <spicoflorin@gmail.com>
>> To: user@uima.apache.org
>> Sent: Monday, 31 October 2011 16:48:18
>> Subject: Consuming RDF ontologies as dictionaries
>> 
>> Hello!
>>  I'm a newbie to UIMA. I would like to know if it is possible to create a
>> dictionary (vocabulary) from an RDF triplestore. I would like to use UIMA
>> to classify the words contained in a text by using a given ontology
>> stored in a triplestore.
>> How can I use UIMA in this particular use case?
>>  I look forward to your answers.
>>  Thank you.
>>  Regards,
>>  Florin
>> 
> 
> 
> 
> -- 
> Dr. Nicolas Hernandez
> Associate Professor (Maître de Conférences)
> Université de Nantes - LINA CNRS
> http://enicolashernandez.blogspot.com
> http://www.univ-nantes.fr/hernandez-n
> +33 (0)2 51 12 58 55
> +33 (0)2 40 30 60 67

