ctakes-dev mailing list archives

From samir chabou <samir...@yahoo.com>
Subject Re: sentence number in WordToken
Date Mon, 30 Sep 2013 16:21:20 GMT
Thanks for the feedback, it's a good point.
I also did it with selectCovering, but as Richard mentioned, I'll change to indexCovering
since it's faster.
Samir




________________________________
 From: "Chen, Pei" <Pei.Chen@childrens.harvard.edu>
To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>; samir chabou <samirchb@yahoo.com>

Sent: Monday, September 30, 2013 12:10:45 PM
Subject: RE: sentence number in  WordToken
 

Samir,
I think Richard has a good point here. What is the use case that requires adding sentenceNumber
to BaseToken in the type system?
If it's only temporary, it may be a good idea to do it programmatically with a local variable
rather than modifying the type system and having it stored in the CAS...?

Maybe something like:
boolean a = JCasUtil.isCovered(jCas, baseToken1, Sentence.class);
boolean b = JCasUtil.isCovered(jCas, baseToken2, Sentence.class);
--Pei


> -----Original Message-----
> From: Richard Eckart de Castilho [mailto:rec@apache.org]
> Sent: Monday, September 30, 2013 11:59 AM
> To: dev@ctakes.apache.org; samir chabou
> Subject: Re: sentence number in WordToken
> 
> Hi,
> 
> if you do many selectCovering calls, you may be faster using indexCovering
> once and then using the lookup index it produces.
> 
> IMHO type systems should not contain information that can easily be
> calculated at runtime (e.g. sentence number, token number, etc.).
> 
> Mind, I have no say here ;) Just my personal opinion.
> 
> -- Richard
> 
> On 30.09.2013, at 16:17, samir chabou <samirchb@yahoo.com> wrote:
> 
> > Hi Pei,
> >
> > I thought this might be of some use ...
> >
> > I need to know whether two or more word tokens belong to the same
> > sentence, and WordToken does not define a sentence number feature,
> > so I added it to the type system. These are the steps:
> >
> > 1)      I added the sentence number feature to the type BaseToken in
> > the TypeSystem.xml file. (I chose the superclass so that the feature
> > is propagated to all subclasses: WordToken, SymbolToken, NumToken ...)
> >
> > 2)      In ctakes-core, in TokenizerAnnotatorPTB.java (method
> > annotateRange), I set the new feature
> > (BaseToken.sentenceNumber = sentence.getSentenceNumber()) as shown
> > below:
> >
> > bta.setSentenceNumber(sentence.getSentenceNumber());
> >       bta.addToIndexes();
> >
> > 3)      I ran JCasGen from the Type System tab of the aggregate
> >
> > 4)      I added the feature in the Source tab of the aggregate
> >
> > As an alternative, I could probably have used:
> > List<Sentence> list = JCasUtil.selectCovering(aJcas, Sentence.class,
> > entity1.getBegin(), entity1.getEnd());
> > The issue with this is that if I have many entities to check at the
> > same time, or if entity1 is found in many places, I have to add some
> > if conditions to get the sentence number.
> >
> >
> > Thanks
> > Samir
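
The idea discussed in this thread (build a token-to-sentence lookup once, then answer "same sentence?" questions from it, without storing a sentence number in the type system) can be sketched in plain Java. This is only an illustration under stated assumptions: Span, SentenceIndexSketch, indexSentenceNumbers and sameSentence are hypothetical names, not cTAKES or uimaFIT API, and the sketch assumes both lists are sorted by begin offset and that sentences do not overlap. In a real pipeline, uimaFIT's indexCovering, as Richard suggests, would produce the lookup from the actual annotations.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SentenceIndexSketch {

    /** Hypothetical stand-in for an annotation: character offsets only. */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    /**
     * Maps each token to the 0-based position of the sentence covering it.
     * Tokens not fully covered by a sentence are left out of the map.
     * Assumes both lists are sorted by begin offset and that sentences
     * do not overlap. Keys are compared by object identity.
     */
    static Map<Span, Integer> indexSentenceNumbers(List<Span> sentences, List<Span> tokens) {
        Map<Span, Integer> index = new HashMap<>();
        int s = 0;
        for (Span token : tokens) {
            // Skip sentences that end at or before the start of this token.
            while (s < sentences.size() && sentences.get(s).end <= token.begin) {
                s++;
            }
            if (s == sentences.size()) {
                break; // no sentences left, remaining tokens are uncovered
            }
            Span sentence = sentences.get(s);
            if (sentence.begin <= token.begin && token.end <= sentence.end) {
                index.put(token, s);
            }
        }
        return index;
    }

    /** True only if both tokens are indexed and share a sentence number. */
    static boolean sameSentence(Map<Span, Integer> index, Span a, Span b) {
        Integer sa = index.get(a);
        Integer sb = index.get(b);
        return sa != null && sa.equals(sb);
    }

    public static void main(String[] args) {
        // Offsets for the text "He coughs. She sleeps."
        List<Span> sentences = Arrays.asList(new Span(0, 10), new Span(11, 22));
        Span he = new Span(0, 2);
        Span coughs = new Span(3, 9);
        Span she = new Span(11, 14);
        Span sleeps = new Span(15, 21);

        // Built once per document, then reused for every lookup.
        Map<Span, Integer> index =
                indexSentenceNumbers(sentences, Arrays.asList(he, coughs, she, sleeps));

        System.out.println(sameSentence(index, he, coughs));  // prints "true"
        System.out.println(sameSentence(index, coughs, she)); // prints "false"
    }
}
```

The sentence number here lives only in a local map for the duration of the processing step, so nothing has to be added to TypeSystem.xml or regenerated with JCasGen.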