ctakes-dev mailing list archives

From "Finan, Sean" <Sean.Fi...@childrens.harvard.edu>
Subject RE: Help needed with document creation time/date
Date Wed, 13 Jul 2016 19:00:02 GMT
Basically, you just want to create a TimeMention.

Here is a short example:

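      // Needs java.util.regex.Matcher and org.apache.ctakes.typesystem.type.textsem.TimeMention.
      // DATE_PATTERN is assumed to define two capture groups (e.g. date and time).
      final String docText = jcas.getDocumentText();
      final Matcher dateMatcher = DATE_PATTERN.matcher( docText );
      // find() locates the pattern anywhere in the note;
      // matches() would succeed only if the entire text matched.
      if ( dateMatcher.find() ) {
         final TimeMention docTime = new TimeMention( jcas );
         docTime.setBegin( dateMatcher.start( 1 ) );
         docTime.setEnd( dateMatcher.end( 2 ) );
         // The id is what later identifies this mention as the document time.
         docTime.setId( 0 );
         docTime.addToIndexes();
      }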
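
To make the offset logic concrete without a cTAKES install, here is a minimal standalone sketch of just the regex part.  The pattern, the header format ("Date: yyyy-MM-dd HH:mm"), and the names DocTimeSpan and findDocTimeSpan are all assumptions for illustration; the real DATE_PATTERN is not shown in this thread and will depend on your notes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DocTimeSpan {
   // Hypothetical header format, e.g. "Date: 2016-07-13 19:00".
   // Group 1 captures the date, group 2 captures the time.
   static final Pattern DATE_PATTERN =
         Pattern.compile( "Date:\\s*(\\d{4}-\\d{2}-\\d{2})\\s+(\\d{2}:\\d{2})" );

   // Returns { begin, end } character offsets spanning the date through the time,
   // or null when the text contains no such header.
   static int[] findDocTimeSpan( final String docText ) {
      final Matcher dateMatcher = DATE_PATTERN.matcher( docText );
      // Scan the whole text for the first occurrence of the header.
      if ( !dateMatcher.find() ) {
         return null;
      }
      return new int[]{ dateMatcher.start( 1 ), dateMatcher.end( 2 ) };
   }

   public static void main( final String[] args ) {
      final String docText = "Clinic Note\nDate: 2016-07-13 19:00\nPatient seen for follow-up.";
      final int[] span = findDocTimeSpan( docText );
      if ( span != null ) {
         // Prints the text that the annotation would cover.
         System.out.println( docText.substring( span[ 0 ], span[ 1 ] ) );
      }
   }
}
```

The two offsets are the values that would be passed to setBegin and setEnd on the annotation.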

If you would rather use the org.cleartk.timeml.type.DocumentCreationTime class, you can.
The difference shows up when you fetch it later: with a TimeMention you rely on the class
type plus the id, while with DocumentCreationTime the class type alone is enough.

Sean

-----Original Message-----
From: Abramowitsch, Peter [mailto:pabramowitsch@hearst.com] 
Sent: Wednesday, July 13, 2016 2:47 PM
To: dev@ctakes.apache.org
Subject: Re: Help needed with document creation time/date

Thanks Sean.  Great advice.

I have a regexNER, but didn't go that route because it looked as if there was an inbuilt mechanism
waiting to be activated.
Say I know the time from some external source, is there a kosher way I can inject it into
the CAS as a creation time property so that it can be retrieved later by a client that knows
only the serialized CAS?

Peter

On 7/13/16, 11:41 AM, "Finan, Sean" <Sean.Finan@childrens.harvard.edu>
wrote:

>Hi Peter,
>
>Our group has used two different approaches, depending upon the note type:
>1.  Use a custom AE that creates creation time based upon a regex.  
>This works well for notes that have a header or footer with a known format.
>2.  Use the last normalized temporal expression.  For my test notes 
>this worked more frequently than you would think (~90%), but I would 
>not go this route unless you have thoroughly thought about what is in 
>your notes and how you are going to use the document creation time.
>
>That is all that we've done with respect to getting the creation time 
>from the actual text.  If you have any kind of structured data tied to 
>the note that indicates date, then you can tie things (e.g. doctimerel,
>doctime) together post-process.  We are doing this in one project.
>
>Sean
>
>-----Original Message-----
>From: Abramowitsch, Peter [mailto:pabramowitsch@hearst.com]
>Sent: Wednesday, July 13, 2016 2:33 PM
>To: dev@ctakes.apache.org
>Subject: Help needed with document creation time/date
>
>Hello All
>
>How can I get cTAKES to deduce the document creation datetime from the 
>text?  I have a pipeline including the following engines:
>
>Basic Token Processing
>
>FastUMLS
>
>Zoner
>
>ClearNLPDependencyParserAE
>
>PolarityCleartkAnalysisEngine
>
>UncertaintyCleartkAnalysisEngine
>
>HistoryCleartkAnalysisEngine
>
>ConditionalCleartkAnalysisEngine
>
>GenericCleartkAnalysisEngine
>
>SubjectCleartkAnalysisEngine
>
>EventAnnotator
>
>AnalysisEngineFactory.createEngineDescription(CopyPropertiesToTemporalEventAnnotator.class)
>
>DocTimeRelAnnotator
>
>BackwardsTimeAnnotator
>
>EventTimeRelationAnnotator
>
>EventEventRelationAnnotator
>
>
>I see that there is a DocumentCreationTime type, but it seems to be 
>initialized from inside one of the ClearTKAnnotators.
>
>I cannot find any documentation and don't know if it is looking for 
>particular manifestations in the text or whether a property needs to be 
>set externally on the JCAS or one of the SOFAs.
>
>
>Any help out there? Examples?
>
>
>Many thanks,
>
>Peter

