uima-user mailing list archives

From Ronny Hapke <ronny.ha...@empolis.com>
Subject Problem with WORDLISTs and WORDTABLEs where an entry starts with a shared substring of another entry
Date Mon, 28 Sep 2015 11:06:02 GMT
I've stumbled upon a problem with UIMA Ruta Workbench 2.3.1 in Eclipse 
Luna 4.4.2. Whenever a WORDLIST or WORDTABLE contains two entries that 
start with the same substring, one of them is not recognized and 
therefore not annotated.

Consider this minimal example:

WORDLIST "Keywords.txt" in resources directory with the following entries:
Bill Clinton
Billy

Input file in input directory with the following contents:
Billy wished he was president, just like Bill Clinton once was.

Main.ruta script in scripts directory:
WORDLIST list = 'Keywords.txt';
DECLARE president;
Document {->MARKFAST(president, list)};

Upon execution, only "Bill Clinton" is annotated, while "Billy" is 
ignored, although both occur in the input.
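
To make the expected result explicit, here is a minimal sketch in Python 
(not Ruta) of the matches I would expect for the input above. It is only 
an illustration: the function name find_entries is hypothetical, it works 
on raw characters rather than tokens, and it says nothing about how 
MARKFAST is actually implemented.

```python
def find_entries(text, entries):
    """Return (start, end, entry) tuples for every list entry found in text.

    Naive character-based scan, assumed here purely for illustration:
    at each position the longest entry that matches is taken.
    """
    # Try longer entries first so the longest match wins at each position.
    ordered = sorted(entries, key=len, reverse=True)
    matches = []
    pos = 0
    while pos < len(text):
        for entry in ordered:
            if text.startswith(entry, pos):
                matches.append((pos, pos + len(entry), entry))
                pos += len(entry)  # continue after the matched entry
                break
        else:
            pos += 1  # no entry starts here, move on by one character
    return matches


text = "Billy wished he was president, just like Bill Clinton once was."
matches = find_entries(text, ["Bill Clinton", "Billy"])
for start, end, entry in matches:
    print(start, end, entry)
```

With the two entries from Keywords.txt, this sketch reports both "Billy" 
and "Bill Clinton", which is the set of annotations I would expect from 
the script as well.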

Any help/hints/comments appreciated!

Best regards, 
Ronny

