ctakes-dev mailing list archives

From "Wu, Stephen T., Ph.D." <Wu.Step...@mayo.edu>
Subject Re: sentence number in WordToken
Date Wed, 02 Oct 2013 15:30:52 GMT
Hmm, we should probably have a process to vote up or down type system
changes like this, since they affect everyone.
In this case I'd agree with the others: don't add it.

stephen



On 9/30/13 11:21 AM, "samir chabou" <samirchb@yahoo.com> wrote:

>Thanks for the feedback, it's a good point.
>I also did it with selectCovering, but as Richard mentioned, I'll change to
>indexCovering since it's faster.
>Samir
>
>
>
>
>________________________________
> From: "Chen, Pei" <Pei.Chen@childrens.harvard.edu>
>To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>; samir chabou
><samirchb@yahoo.com>
>Sent: Monday, September 30, 2013 12:10:45 PM
>Subject: RE: sentence number in  WordToken
> 
>
>Samir,
>I think Richard has a good point here. What is the use case that requires
>adding sentenceNumber to BaseToken in the type system?
>If it's only temporary, it may be a good idea to do it programmatically
>with a local variable rather than modifying the type system and having it
>stored in the CAS...?
>
>Maybe something like:
>boolean a = JCasUtil.isCovered(jCas, baseToken1, Sentence.class);
>boolean b = JCasUtil.isCovered(jCas, baseToken2, Sentence.class);
>--Pei
>
>
>> -----Original Message-----
>> From: Richard Eckart de Castilho [mailto:rec@apache.org]
>> Sent: Monday, September 30, 2013 11:59 AM
>> To: dev@ctakes.apache.org; samir chabou
>> Subject: Re: sentence number in WordToken
>> 
>> Hi,
>> 
>> if you do many selectCovering calls, it may be faster to call indexCovering
>> once and then use the lookup index it produces.
>> 
>> IMHO type systems should not contain information that can easily be
>> calculated at runtime (e.g. sentence number, token number, etc.).
>> 
>> Mind, I have no say here ;) Just my personal opinion.
>> 
>> -- Richard
>> 
>> On 30.09.2013, at 16:17, samir chabou <samirchb@yahoo.com> wrote:
>> 
>> > Hi Pei,
>> >
>> > I thought this might have some use ...
>> >
>> > I need to know whether two or more word tokens belong to the same
>> > sentence, and since WordToken does not define a sentence number
>> > feature, I added it to the type system. These are the steps:
>> >
>> > 1)      I added the sentenceNumber feature to the BaseToken type in
>> > the TypeSystem.xml file (I chose the superclass so that the feature
>> > is propagated to all subclasses: WordToken, SymbolToken, NumToken ...)
>> >
>> > 2)      In ctakes-core, in TokenizerAnnotatorPTB.java (method
>> > annotateRange), I set the new feature
>> > (BaseToken.sentenceNumber = sentence.getSentenceNumber()) as
>> > shown below:
>> >
>> > bta.setSentenceNumber(sentence.getSentenceNumber());
>> >       bta.addToIndexes();
>> >
>> > 3)      Run JCasGen from the Type System tab of the aggregate
>> >
>> > 4)      Add the feature in the Source tab of the aggregate
>> >
>> > As an alternative, I could probably have used:
>> >
>> > List<Sentence> list = JCasUtil.selectCovering(aJcas, Sentence.class,
>> >         entity1.getBegin(), entity1.getEnd());
>> >
>> > The issue with this is that if I have many entities to check at the
>> > same time, or if entity1 is found in many places, I have to add some
>> > if conditions to get the sentence number.
>> >
>> >
>> > Thanks
>> > Samir
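
The approach suggested in the thread (build a lookup once, then derive the sentence number at runtime instead of storing it in the type system) could be sketched roughly as follows. This is only an illustration in plain Java, not the uimaFIT or cTAKES API: the Span class, SentenceLookup, indexSentenceNumbers and sameSentence are hypothetical stand-ins for Sentence/BaseToken annotations and for the kind of index that indexCovering produces, and the sketch assumes sentences do not overlap.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in model; not the uimaFIT or cTAKES API.
class SentenceLookup {

    /** Stand-in for an annotation: a character span [begin, end). */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    /**
     * Builds a token -> sentence number lookup once, up front.
     * The sentence number is the 0-based position in the sentence list,
     * so nothing needs to be stored on the tokens themselves.
     */
    static Map<Span, Integer> indexSentenceNumbers(List<Span> sentences, List<Span> tokens) {
        Map<Span, Integer> index = new IdentityHashMap<>();
        for (Span token : tokens) {
            for (int i = 0; i < sentences.size(); i++) {
                Span sentence = sentences.get(i);
                if (sentence.begin <= token.begin && token.end <= sentence.end) {
                    index.put(token, i);
                    break; // assumption: sentences do not overlap
                }
            }
        }
        return index;
    }

    /** True only if both tokens were indexed and share a sentence number. */
    static boolean sameSentence(Map<Span, Integer> index, Span a, Span b) {
        Integer sa = index.get(a);
        Integer sb = index.get(b);
        return sa != null && sa.equals(sb);
    }

    public static void main(String[] args) {
        List<Span> sentences = new ArrayList<>();
        sentences.add(new Span(0, 20));
        sentences.add(new Span(21, 40));

        Span t1 = new Span(0, 5);
        Span t2 = new Span(6, 10);
        Span t3 = new Span(25, 30);
        List<Span> tokens = new ArrayList<>();
        tokens.add(t1);
        tokens.add(t2);
        tokens.add(t3);

        Map<Span, Integer> index = indexSentenceNumbers(sentences, tokens);
        System.out.println(sameSentence(index, t1, t2)); // true
        System.out.println(sameSentence(index, t1, t3)); // false
        System.out.println(index.get(t3));               // 1
    }
}
```

With a lookup like this, checking many entities costs one map access each, and no "if" chains around repeated selectCovering calls are needed.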

