uima-user mailing list archives

From Michael Tanenblatt <sloth...@park-slope.net>
Subject Re: Concept Mapper Annotator Question
Date Thu, 19 Feb 2015 16:37:26 GMT
Alex is correct—you need to generate all of the variants, e.g.:

<token canonical="Location" DOCNO="10000">
	<variant base = "New York City"/>
	<variant base = "New York"/>
	<variant base = "New City"/>
	<variant base = "York City"/>
</token>

This is because ConceptMapper does not allow for partial matching (though, as you have discovered,
you can set “OrderIndependentLookup” to “true” to ignore the token ordering during
lookup).
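
To illustrate the kind of variant generation described above, here is a minimal Python sketch of a dictionary compiler. It is not part of ConceptMapper; the function names (expand_variants, token_entry) and the min_tokens parameter are hypothetical, and it assumes the variants you want are the in-order sub-sequences of at least two tokens, which is what the example entry above contains.

```python
from itertools import combinations
from xml.sax.saxutils import quoteattr


def expand_variants(phrase, min_tokens=2):
    """Return the phrase plus every in-order sub-sequence of its tokens.

    Hypothetical helper: sub-sequences shorter than min_tokens are skipped,
    but the full phrase is always kept, even if it is shorter than that.
    """
    tokens = phrase.split()
    if not tokens:
        return []
    lowest = min(min_tokens, len(tokens))
    variants = []
    # Longest variants first, so the full phrase leads the list.
    for size in range(len(tokens), lowest - 1, -1):
        for combo in combinations(tokens, size):
            variant = " ".join(combo)
            if variant not in variants:  # skip duplicates from repeated tokens
                variants.append(variant)
    return variants


def token_entry(canonical, docno, phrase, min_tokens=2):
    """Render one <token> dictionary entry with all generated variants."""
    lines = ["<token canonical=%s DOCNO=%s>" % (quoteattr(canonical), quoteattr(docno))]
    for variant in expand_variants(phrase, min_tokens):
        lines.append("\t<variant base=%s/>" % quoteattr(variant))
    lines.append("</token>")
    return "\n".join(lines)


print(token_entry("Location", "10000", "New York City"))
```

Note that the number of sub-sequences grows exponentially with the number of tokens in a phrase, so for long entity names you may want to raise min_tokens or cap the phrase length before expanding.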

Michael


> On Feb 19, 2015, at 11:06 AM, alex@nlpfu.com wrote:
> 
> Hi Alberto,
> 
> I would write a dictionary compiler to extend your base dictionary to all the variants
> that you want to detect.
> 
> Hope this helps,
> 
> Alex
> 
> ----- Reply message -----
> From: "Alberto Garcia" <albertogarcia.garcia@gmail.com>
> To: <user@uima.apache.org>
> Subject: Concept Mapper Annotator Question
> Date: Thu, Feb 19, 2015 03:50
> 
> We are starting to use the UIMA framework for entity identification. We base
> our solution on some dictionaries which contain the entities we need to
> recognize.
> 
> We are using the Concept Mapper annotator, and it is really fast at
> recognizing the complete name of an entity, but it fails to recognize part
> of the entity. Let me explain with an example.
> 
> Let's say we have this entry in the dictionary:
> 
> <token canonical="Location" DOCNO="10000">
> 	<variant base = "New York City"/>
> </token>
> 
> If we call the service with “New York City” as input text, it recognizes
> the entity as a Location.
> 
> If we call the service with “New City York” or other permutations, it
> recognizes the entity as a Location.
> 
> BUT if we call the service with “New City”, it does not recognize it as a
> Location.
> 
> Can anyone tell me how I can implement or configure this behavior for the
> Concept Annotator?

