ctakes-dev mailing list archives

From Jörn Kottmann <kottm...@gmail.com>
Subject Re: sentence detector newline behavior
Date Sun, 26 Jan 2014 14:58:07 GMT
On 01/25/2014 10:03 PM, Miller, Timothy wrote:
> On 01/25/2014 12:24 PM, Jörn Kottmann wrote:
>> The code which computes the spans tries to remove white space from them.
>> Removing the white space from a whitespace-only sentence is causing
>> the exception you are seeing. Which response would you expect from
>> the sentence detector? Should a whitespace-only sentence be returned?
> I would say no.
>
>> If a sentence is terminated by a newline, should the newline
>> character be included in the sentence span or not?
> I would also say no.
>
>
> I made a quick patch for this issue -- now it runs but scores really
> poorly compared to my model file (30 vs 75 or so). I suspect something
> is wrong with the evaluation, the spans being slightly off somehow.

The evaluation should ignore white space. I have now committed my fix;
it would be nice if you could test it.
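
To make the intended behavior concrete, here is a minimal sketch in Python (not the project's Java, and not the committed fix). The helper names trim_span, normalize_spans and f_measure are hypothetical. It assumes sentence spans are (start, end) character offsets with an exclusive end, trims leading and trailing white space from each span, drops whitespace-only spans, and scores predicted spans against gold spans by exact match after that normalization:

```python
def trim_span(text, start, end):
    """Shrink the half-open span [start, end) so it neither begins nor
    ends with white space. Return None for a whitespace-only span."""
    while start < end and text[start].isspace():
        start += 1
    while end > start and text[end - 1].isspace():
        end -= 1
    return (start, end) if start < end else None


def normalize_spans(text, spans):
    """Trim every span and drop the ones that were white space only."""
    trimmed = (trim_span(text, start, end) for start, end in spans)
    return [span for span in trimmed if span is not None]


def f_measure(text, gold, predicted):
    """Exact-match F1 over spans, compared after trimming, so a trailing
    newline in a predicted span does not count as an error."""
    gold_set = set(normalize_spans(text, gold))
    pred_set = set(normalize_spans(text, predicted))
    hits = len(gold_set & pred_set)
    if hits == 0:
        return 0.0
    precision = hits / len(pred_set)
    recall = hits / len(gold_set)
    return 2 * precision * recall / (precision + recall)
```

With this normalization, a detector that includes the terminating newline in a span, or emits an extra span for a blank line, gets the same score as one that does not. If the evaluation compares raw offsets instead, every such span counts as a miss, which could explain a large drop in the score.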

There might still be something wrong. In my test data I replaced all
question marks with white spaces, and the result is slightly worse than
with the original data.

Jörn
