uima-user mailing list archives

From Tommaso Teofili <tommaso.teof...@gmail.com>
Subject Possible bug with DictionaryAnnotator and escaped characters
Date Tue, 19 Apr 2011 08:34:02 GMT
Hi all,

I've just noticed unexpected behavior in DictionaryAnnotator: if you
create a dictionary with the DictionaryCreator and your input file (a text
file with one entry per line) contains characters like & or ', they get
converted to their escaped forms &amp; or &apos;, which is correct XML
syntax. The problem is that such entries then fail to match the original
entry string.
For example, a line like me&co is written as
<entry><key>me&amp;co</key></entry> inside dictionary.xml, but neither the
string "me&co" nor "me&amp;co" generates a match (and thus no
DictionaryAnnotation is created).
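
For reference, here is a minimal sketch of what I would expect on the
reading side. It uses only the JDK's standard DOM parser, not the
DictionaryAnnotator code, and the class and method names (EntityRoundTrip,
readKey) are made up for illustration. It only shows that a standard XML
parser turns &amp; back into & when the entry above is read, so the escaped
form in dictionary.xml should not by itself prevent a match on "me&co".

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of UIMA: checks how a standard XML parser
// reads back an escaped dictionary key.
class EntityRoundTrip {

    // Parses an XML fragment and returns the text of its first <key> element.
    static String readKey(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("key").item(0).getTextContent();
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalArgumentException("Could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        // Same entry as written to dictionary.xml in the example above.
        String key = readKey("<entry><key>me&amp;co</key></entry>");
        System.out.println(key);                  // prints "me&co"
        System.out.println(key.equals("me&co"));  // prints "true"
    }
}
```

This says nothing about where the mismatch actually happens inside the
annotator; it only rules out the XML escaping as such.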

Am I missing something?
