ctakes-dev mailing list archives

From "Finan, Sean" <Sean.Fi...@childrens.harvard.edu>
Subject RE: YTEX cTAKES 3.1.1 ready
Date Thu, 06 Feb 2014 18:27:06 GMT
Right, got it.  I just wanted to let you know that some EMR notes -do- require sentence splitting
at newline characters.
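
A minimal sketch of the two behaviours being discussed, in Python rather than cTAKES code. Everything in it is an assumption made for illustration: the function names, the naive punctuation rule standing in for a real sentence detector, and the two invented example notes. It is not how either the cTAKES or the YTEX detector is implemented.

```python
import re


def _split_on_punctuation(text):
    # Naive stand-in for a real detector: break after '.', '!' or '?'
    # when followed by whitespace. Empty pieces are dropped.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def split_on_newlines(text):
    """Treat every newline as a hard sentence boundary."""
    sentences = []
    for line in text.split("\n"):
        sentences.extend(_split_on_punctuation(line))
    return sentences


def split_ignoring_newlines(text):
    """Treat newlines as ordinary whitespace inside a sentence."""
    return _split_on_punctuation(text.replace("\n", " "))


# Invented examples, not taken from any real note.
hard_wrapped = "Patient denies chest pain or\nshortness of breath."
no_punctuation = "Will consider biopsy\nGiven history, recheck in two weeks."

for note in (hard_wrapped, no_punctuation):
    print(split_on_newlines(note))
    print(split_ignoring_newlines(note))
```

On the hard-wrapped note, splitting at newlines cuts one sentence in two. On the note whose first line lacks end punctuation, ignoring newlines fuses two statements into one. Which behaviour is right depends on the corpus.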

-----Original Message-----
From: vijay garla [mailto:vngarla@gmail.com] 
Sent: Thursday, February 06, 2014 1:06 PM
To: dev@ctakes.apache.org
Cc: ytex-users@googlegroups.com; ctakes-dev@incubator.apache.org; vlad.valtchinov@gmail.com
Subject: Re: YTEX cTAKES 3.1.1 ready

The cTAKES sentence detector is not changed in the YTEX branch.  The YTEX branch has an *additional*
sentence detector that does not automatically split sentences on newlines - users can use
this if they like.

-vj


On Thu, Feb 6, 2014 at 1:01 PM, Finan, Sean <Sean.Finan@childrens.harvard.edu> wrote:

> Hi Vijay,
>
> > I have yet to run across clinical text from a real EMR where
> > newlines represent the end of a sentence
>
> Since James pointed out this possibility a couple of weeks ago, I have
> kept my eyes open.  The problem is pretty ubiquitous in a corpus that
> I'm working with right now.  I just opened the first note and gave it
> a count ... 95 lines total, 9 of which are sentence/phrase endings that
> lack punctuation.  This is not including lists, which comprise about
> half of the note.
> One possible conjoinment was "Will consider [...] biopsy\nGiven [...]".
> Depending upon how cTAKES deals with it, the meaning could change
> drastically.
>
> > I believe cTAKES absolutely has to support sentences with newlines
> > within them
>
> Yes, cTAKES should do so, but I hope that you aren't suggesting that
> it only support such a structure.
>
> Where is that easy button?
>
> -----Original Message-----
> From: vijay garla [mailto:vngarla@gmail.com]
> Sent: Thursday, February 06, 2014 10:31 AM
> To: dev@ctakes.apache.org
> Cc: ytex-users@googlegroups.com; ctakes-dev@incubator.apache.org; 
> vlad.valtchinov@gmail.com
> Subject: Re: YTEX cTAKES 3.1.1 ready
>
> I believe it is worth migrating to trunk.
>
> Note that the sentence detector is also complementary - the existing 
> ctakes sentence detector is unchanged - users can choose which 
> sentence detector to use.  There are changes to assertion & dependency
> parsing to support sentences with newlines within them, and that works
> with both sentence detectors.
>
> I believe cTAKES absolutely has to support sentences with newlines 
> within them - I have yet to run across clinical text from a real EMR 
> where newlines represent the end of a sentence - the changes to 
> assertion & dependency parsing will have to be done at some point.
>
> -vj
>
>
> On Thu, Feb 6, 2014 at 10:19 AM, Chen, Pei
> <Pei.Chen@childrens.harvard.edu> wrote:
>
> > VJ,
> > Aside from the changes to the existing cTAKES code (sentence detector,
> > etc.) [which we could leave out if it's still being debated], do you
> > think it's worth migrating the ytex code to trunk at this point?
> > As you mentioned earlier, it's largely complementary.
> > [I was just thinking of saving the effort of maintaining the separate
> > branch, and of simplicity for dev...]
> >
> > --Pei
> >
> > > -----Original Message-----
> > > From: vijay garla [mailto:vngarla@gmail.com]
> > > Sent: Wednesday, February 05, 2014 9:30 PM
> > > To: ytex-users@googlegroups.com; ctakes-dev@incubator.apache.org; 
> > > vlad.valtchinov@gmail.com
> > > Subject: Re: YTEX cTAKES 3.1.1 ready
> > >
> > > Hi Vlad,
> > >
> > > > I updated the UMLS install guide; see
> > > https://code.google.com/p/ytex/wiki/UMLS_SQL_SERVER_3_1
> > >
> > > > I would prefer to add the docs in the ctakes confluence, but as
> > > > far as I can tell, I don't have write access there - can somebody
> > > > give me write privileges on the ctakes confluence site?
> > >
> > > > There was a bug in the umls install; copy
> > > > https://svn.apache.org/repos/asf/ctakes/branches/ytex/ctakes-ytex/scripts/data/build.xml
> > > > over the corresponding file in your ctakes-3.1.2 install
> > > > (CTAKES_HOME\bin\ctakes-ytex\scripts\data) and you should be set.
> > > > The import is currently running on the UMLS 2013AA (I assume this
> > > > will complete without issues as long as the umls schema hasn't
> > > > changed from 2012).
> > >
> > > > What trial and error did you have to go through to build the distro?
> > >
> > > -vj
> > >
> > >
> > > On Wed, Feb 5, 2014 at 5:33 PM, vijay garla <vngarla@gmail.com> wrote:
> > >
> > > > Hi Vlad,
> > > >
> > > > sorry that the instructions aren't clear.
> > > >
> > > > re 1) What I am trying to say is install 
> > > > apache-ctakes-3.2.0-snapshot as usual (this is unchanged from 
> > > > 3.1.1).  After that you still have to apply the lib and 
> > > > resources (these are things that cannot be distributed via apache).
> > > >
> > > > re 2) Yes, I need to update those docs.  Hopefully will get to 
> > > > that at some point.  However, I assume you already have a UMLS 
> > > > DB (also assume SQL Server).  If you can't/don't want to use 
> > > > > your existing umls DB, please tell me.  Then I'll prioritize
> > > > upgrading the doc on importing the umls tables (the scripts are there).
> > > >
> > > > best,
> > > >
> > > > VJ
> > > >
> > > >
> > > > On Wed, Feb 5, 2014 at 4:44 PM, <vlad.valtchinov@gmail.com> wrote:
> > > >
> > > >> Hi VJ-
> > > >>
> > > > >> so, with trial and error we were able to make the distribution and
> > > >> now have the apache-ctakes-3.1.2-SNAPSHOT-bin.zip archive.
> > > >>
> > > >> Here's what's unclear.
> > > >>
> > > > >> 1. Is this now the only (combined) thing that you need for ctakes
> > > > >> 3.1.1 + YTEX?
> > > > >> The current documentation
> > > > >> (https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1),
> > > > >> which most probably is outdated, talks about installing cTAKES
> > > > >> 3.1.1 first and then applying 2 (downloadable) SNAPSHOT archives,
> > > > >> lib and resources.
> > > > >> This is a point of confusion.
> > > >>
> > > >> 2. The directions to import UMLS subset are then outdated as well.
> > > >> Maybe one should use the old version (ctakes 2.5 and ytex 0.8) 
> > > >> to import the RRF files for the UMLS subset and then just use 
> > > >> the resulting db. Thoughts?
> > > >>
> > > >> Thanks,
> > > >> Vlad Valtchinov
> > > >> Brigham Rad
> > > >>
> > > >>
> > > >> On Thursday, January 30, 2014 5:17:43 PM UTC-5, vijay garla wrote:
> > > >>
> > > >>> Hi Vlad,
> > > >>>
> > > >>>
> > > > >>> All of ytex has been moved into ctakes; it is currently in a branch
> > > > >>> (https://svn.apache.org/repos/asf/ctakes/branches/ytex).  You
> > > > >>> don't have to install ytex-0.8 - instead you will have to
> > > > >>> build and install from the ytex branch to create your own
> > > > >>> distribution.  Steps 2 & 3 are correct.
> > > >>>
> > > >>> Although it is a pain, if you have the jdk, maven, and svn, 
> > > >>> you can easily build your own distro:
> > > >>> * open a command prompt
> > > >>> * make sure jdk, maven, and svn are in your path
> > > > >>> * cd to some directory where you want to check stuff out (I
> > > > >>> like c:\temp)
> > > > >>> * run the following commands:
> > > > >>> rmdir /s /q ctakes
> > > > >>> svn co https://svn.apache.org/repos/asf/ctakes/branches/ytex ctakes
> > > > >>> cd ctakes
> > > > >>> mvn clean install -DskipTests
> > > >>>
> > > > >>> And you will have the ctakes (with ytex) distro in
> > > > >>> ctakes\ctakes-distribution\target\apache-ctakes-3.1.2-SNAPSHOT-bin.zip
> > > >>>
> > > >>> What is the process for getting the ytex branch merged into trunk?
> > > >>> As I mentioned, there are very few changes to other ctakes 
> > > >>> classes/types - this should be completely complementary and 
> > > >>> not affect any existing ctakes functionality.
> > > >>>
> > > >>> -vj
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > > >>> On Thu, Jan 30, 2014 at 4:56 PM, <vlad.va...@gmail.com> wrote:
> > > >>>
> > > >>>> Hi VJ--
> > > >>>>
> > > >>>> this is great!! Thanks for all the hard work on it!
> > > >>>>
> > > > >>>> We're starting to look into the new install. For now we're
> > > > >>>> trying the binaries out.
> > > >>>>
> > > >>>> There were these questions about the proper install steps:
> > > >>>>
> > > > >>>> 1. Do we first install ytex-0.8
> > > > >>>> 2. Then install the new cTAKES 3.1.1 instance and also apply the
> > > > >>>> SNAPSHOT lib and resources zips
> > > > >>>> 3. Work our way to install the UMLS ontologies in the db
> > > >>>>
> > > > >>>> It is not entirely clear from the new document
> > > > >>>> (https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1)
> > > > >>>> whether there's still a need to install ytex-0.8, or whether YTEX
> > > > >>>> has been entirely merged into cTAKES.
> > > >>>>
> > > > >>>> If the latter is correct, there are parts missing, i.e. in the
> > > > >>>> UMLS install steps that are linked from the new ctakes 3.1.1
> > > > >>>> document.
> > > >>>>
> > > >>>> Thanks,
> > > >>>> vlad
> > > >>>>
> > > >>>>
> > > > >>>> On Friday, January 3, 2014 10:21:52 PM UTC-5, vijay garla wrote:
> > > >>>>>
> > > >>>>> Hello All,
> > > >>>>>
> > > > >>>>> I have finished an initial cut at the port of YTEX to cTAKES 3.1.1.
> > > > >>>>> Most of the YTEX functionality has been ported and
> > > > >>>>> integrated with cTAKES, and I've tested with MySQL and MS
> > > > >>>>> SQL Server (Oracle tests pending).
> > > >>>>>
> > > > >>>>> Most of the changes were made in new projects - very little
> > > > >>>>> existing cTAKES code has been modified.  The only
> > > > >>>>> non-trivial changes are in
> > > > >>>>> /ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api
> > > > >>>>> - here I modified
> > > > >>>>> CharacterOffsetToLineTokenConverterCtakesImpl &
> > > > >>>>> SingleDocumentProcessorCtakes to deal with newlines within
> > > > >>>>> sentences correctly.  Can somebody take a look at the
> > > > >>>>> changes in the ytex branch?
> > > >>>>>
> > > > >>>>> I believe that the branch
> > > > >>>>> https://svn.apache.org/repos/asf/ctakes/branches/ytex is ready to
> > > > >>>>> be merged into ctakes trunk, but would like other users to test it
> > > > >>>>> as well.  Questions:
> > > >>>>>
> > > > >>>>> * How can I distribute the ctakes binary distribution to
> > > > >>>>> ytex users before the merge? Can we make the branch build
> > > > >>>>> available somewhere?  The binary distribution is too large
> > > > >>>>> to host on the ytex google code site (max 200 MB)
> > > > >>>>> * Non-ASF libraries - I have segregated these out into their
> > > > >>>>> own zip file that can be distributed via sourceforge.  As a
> > > > >>>>> stopgap, I can upload this to the ytex google code site, but
> > > > >>>>> would prefer to upload to sourceforge.
> > > > >>>>> * UMLS Derivatives - Ditto for these - would like to move to
> > > > >>>>> sourceforge.
> > > > >>>>> * Documentation - How can I update the confluence docs?  I
> > > > >>>>> would migrate the documentation from the google code website.
> > > >>>>>
> > > > >>>>> Here are the installation instructions (putting the wagon in
> > > > >>>>> front of the horse ...)
> > > > >>>>>
> > > > >>>>> https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1
> > > >>>>>
> > > >>>>> Best,
> > > >>>>>
> > > >>>>> VJ
> > > >>>>>
> > > >>>>>
> > > >>>>>  --
> > > > >>>> You received this message because you are subscribed to the
> > > > >>>> Google Groups "ytex-users" group.
> > > > >>>> To unsubscribe from this group and stop receiving emails from
> > > > >>>> it, send an email to ytex-users+...@googlegroups.com.
> > > > >>>> To post to this group, send email to ytex-...@googlegroups.com.
> > > > >>>> To view this discussion on the web visit
> > > > >>>> https://groups.google.com/d/msgid/ytex-users/70f03a80-ce1a-4c0e-b35d-5116d1c93ea0%40googlegroups.com.
> > > >>>>
> > > >>>> For more options, visit https://groups.google.com/groups/opt_out.
> > > >>>>
> > > >>>
> > > >>>  --
> > > >> You received this message because you are subscribed to the 
> > > >> Google Groups "ytex-users" group.
> > > >> To unsubscribe from this group and stop receiving emails from 
> > > >> it, send an email to ytex-users+unsubscribe@googlegroups.com.
> > > >> To post to this group, send email to ytex-users@googlegroups.com.
> > > >> To view this discussion on the web visit
> > > >> https://groups.google.com/d/msgid/ytex-users/bc3bd705-55d2-4acd
> > > >> -
> > > a273-
> > > >> a3b1a7b36241%40googlegroups.com .
> > > >>
> > > >> For more options, visit https://groups.google.com/groups/opt_out.
> > > >>
> > > >
> > > >
> >
>
