ctakes-dev mailing list archives

From vijay garla <vnga...@gmail.com>
Subject Re: YTEX cTAKES 3.1.1 ready
Date Thu, 06 Feb 2014 18:05:08 GMT
The cTAKES sentence detector is not changed in the YTEX branch.  The YTEX
branch has an *additional* sentence detector that does not automatically
split sentences on newlines - users can use this if they like.

-vj
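
To make the difference between the two behaviours concrete, here is a minimal sketch in Python. It is not the cTAKES or the YTEX sentence detector; the function names and the punctuation rule are assumptions made purely for illustration. One function treats every newline as a sentence boundary, the other treats newlines as ordinary whitespace.

```python
import re

# Hypothetical helpers for illustration only; neither is taken from cTAKES or YTEX.
# A boundary is assumed to be whitespace that follows '.', '!' or '?'.
_END_PUNCT = re.compile(r'(?<=[.!?])\s+')

def split_newline_terminated(text):
    """Treat every newline as a sentence boundary, then split on end punctuation."""
    sentences = []
    for line in text.splitlines():
        sentences.extend(s.strip() for s in _END_PUNCT.split(line) if s.strip())
    return sentences

def split_newline_tolerant(text):
    """Treat newlines as ordinary whitespace; split only on end punctuation."""
    flattened = re.sub(r'\s*\n\s*', ' ', text)
    return [s.strip() for s in _END_PUNCT.split(flattened) if s.strip()]
```

On the made-up note "Patient denies chest\npain. Will consider biopsy\nGiven history, follow up.", the first function cuts the wrapped sentence into "Patient denies chest" and "pain.", while the second joins "Will consider biopsy Given history, follow up." into one sentence. Each strategy fails on one kind of input, which is why offering both detectors as a choice seems reasonable.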


On Thu, Feb 6, 2014 at 1:01 PM, Finan, Sean <
Sean.Finan@childrens.harvard.edu> wrote:

> Hi Vijay,
>
> >  I have yet to run across clinical text from a real EMR where newlines
> represent the end of a sentence
>
> Since James pointed out this possibility a couple weeks ago, I have kept
> my eyes open.  The problem is pretty ubiquitous in a corpus that I'm
> working with right now.  I just opened the first note and gave it a count
> ... 95 lines total, 9 are sentence/phrase (lacking punctuation) endings.
>  This is not including lists, which comprise about half of the note.
> One possible conjoinment was "Will consider [...] biopsy\nGiven [...]".
>  Depending upon how cTakes deals with it, the meaning could change
> drastically.
>
> > I believe cTAKES absolutely has to support sentences with newlines
> within them
>
> Yes, cTakes should do so, but I hope that you aren't suggesting that it
> only support such a structure.
>
> Where is that easy button?
>
> -----Original Message-----
> From: vijay garla [mailto:vngarla@gmail.com]
> Sent: Thursday, February 06, 2014 10:31 AM
> To: dev@ctakes.apache.org
> Cc: ytex-users@googlegroups.com; ctakes-dev@incubator.apache.org;
> vlad.valtchinov@gmail.com
> Subject: Re: YTEX cTAKES 3.1.1 ready
>
> I believe it is worth migrating to trunk.
>
> Note that the sentence detector is also complementary - the existing
> ctakes sentence detector is unchanged - users can choose which sentence
> detector to use.  There are changes to assertion & dependency parsing to
> support sentences with newlines within them, and that works with both
> sentence detectors.
>
> I believe cTAKES absolutely has to support sentences with newlines within
> them - I have yet to run across clinical text from a real EMR where
> newlines represent the end of a sentence - the changes to assertion &
> dependency parsing will have to be done at some point.
>
> -vj
>
>
> On Thu, Feb 6, 2014 at 10:19 AM, Chen, Pei
> <Pei.Chen@childrens.harvard.edu> wrote:
>
> > VJ,
> > Aside from the changes to the existing cTAKES code (sentence detector,
> > etc.) [which we could leave out if it's still being debated], do you
> > think it's worth migrating the ytex code to trunk at this point?
> > As you mentioned earlier, it's largely complementary.
> > [I was just thinking of saving effort to maintain the separate branch
> > and for simplicity for dev...]
> >
> > --Pei
> >
> > > -----Original Message-----
> > > From: vijay garla [mailto:vngarla@gmail.com]
> > > Sent: Wednesday, February 05, 2014 9:30 PM
> > > To: ytex-users@googlegroups.com; ctakes-dev@incubator.apache.org;
> > > vlad.valtchinov@gmail.com
> > > Subject: Re: YTEX cTAKES 3.1.1 ready
> > >
> > > Hi Vlad,
> > >
> > > > I updated the umls install guide; see
> > > https://code.google.com/p/ytex/wiki/UMLS_SQL_SERVER_3_1
> > >
> > > > I would prefer to add the docs in the ctakes confluence, but as far
> > > > as I can tell, I don't have write access there - can somebody give
> > > > me write privileges on the ctakes confluence site?
> > >
> > > > There was a bug in the umls install; copy
> > > > https://svn.apache.org/repos/asf/ctakes/branches/ytex/ctakes-ytex/scripts/data/build.xml
> > > > over the corresponding file in your ctakes-3.1.2 install
> > > > (CTAKES_HOME\bin\ctakes-ytex\scripts\data) and you should be set.
> > > > The import is currently running on the UMLS 2013AA (I assume this
> > > > will complete without issues as long as the umls schema hasn't
> > > > changed from 2012).
> > >
> > > what trial and error did you have to go through to build the distro?
> > >
> > > -vj
> > >
> > >
> > > On Wed, Feb 5, 2014 at 5:33 PM, vijay garla <vngarla@gmail.com> wrote:
> > >
> > > > Hi Vlad,
> > > >
> > > > sorry that the instructions aren't clear.
> > > >
> > > > re 1) What I am trying to say is install
> > > > apache-ctakes-3.2.0-snapshot as usual (this is unchanged from
> > > > 3.1.1).  After that you still have to apply the lib and resources
> > > > (these are things that cannot be distributed via apache).
> > > >
> > > > re 2) Yes, I need to update those docs.  Hopefully will get to
> > > > that at some point.  However, I assume you already have a UMLS DB
> > > > > (also assume SQL Server).  If you can't/don't want to use your
> > > > > existing umls DB, please tell me.  Then I'll prioritize upgrading
> > > > > the doc on importing the umls tables (the scripts are there).
> > > >
> > > > best,
> > > >
> > > > VJ
> > > >
> > > >
> > > > On Wed, Feb 5, 2014 at 4:44 PM, <vlad.valtchinov@gmail.com> wrote:
> > > >
> > > >> Hi VJ-
> > > >>
> > > > >> So, with trial and error we were able to make the distribution and
> > > >> now have the apache-ctakes-3.1.2-SNAPSHOT-bin.zip archive.
> > > >>
> > > >> Here's what's unclear.
> > > >>
> > > > >> 1. Is this now the only (combined) thing that you need for ctakes
> > > > >> 3.1.1 + Ytex?
> > > > >> The current documentation
> > > > >> (https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1),
> > > > >> which most probably is outdated, talks about installing cTakes
> > > > >> 3.1.1 first and then applying 2 (downloadable) SNAPSHOT archives,
> > > > >> lib and resources.
> > > > >> This is a point of confusion.
> > > >>
> > > >> 2. The directions to import UMLS subset are then outdated as well.
> > > >> Maybe one should use the old version (ctakes 2.5 and ytex 0.8) to
> > > >> import the RRF files for the UMLS subset and then just use the
> > > >> resulting db. Thoughts?
> > > >>
> > > >> Thanks,
> > > >> Vlad Valtchinov
> > > >> Brigham Rad
> > > >>
> > > >>
> > > >> On Thursday, January 30, 2014 5:17:43 PM UTC-5, vijay garla wrote:
> > > >>
> > > >>> Hi Vlad,
> > > >>>
> > > >>>
> > > > >>> All of ytex has been moved into ctakes; it is currently in a
> > > > >>> branch (https://svn.apache.org/repos/asf/ctakes/branches/ytex).
> > > > >>> You don't have to install ytex-0.8 - instead you will have to
> > > > >>> build and install from the ytex branch to create your own
> > > > >>> distribution.  Steps 2 & 3 are correct.
> > > >>>
> > > >>> Although it is a pain, if you have the jdk, maven, and svn, you
> > > >>> can easily build your own distro:
> > > >>> * open a command prompt
> > > >>> * make sure jdk, maven, and svn are in your path
> > > >>> * cd to some directory where you want to check stuff out (I like
> > > >>> c:\temp)
> > > >>> * run the following commands
> > > >>> rmdir /s /q ctakes
> > > > >>> svn co https://svn.apache.org/repos/asf/ctakes/branches/ytex ctakes
> > > > >>> cd ctakes
> > > > >>> mvn clean install -DskipTests
> > > >>>
> > > >>> And you will have the ctakes (with ytex) distro in
> > > > >>> ctakes\ctakes-distribution\target\apache-ctakes-3.1.2-SNAPSHOT-bin.zip
> > > >>>
> > > >>> What is the process for getting the ytex branch merged into trunk?
> > > >>> As I mentioned, there are very few changes to other ctakes
> > > >>> classes/types - this should be completely complementary and not
> > > >>> affect any existing ctakes functionality.
> > > >>>
> > > >>> -vj
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > > >>> On Thu, Jan 30, 2014 at 4:56 PM, <vlad.va...@gmail.com> wrote:
> > > >>>
> > > >>>> Hi VJ--
> > > >>>>
> > > >>>> this is great!! Thanks for all the hard work on it!
> > > >>>>
> > > >>>> We're starting to look into the new install. For now we're
> > > >>>> trying the binaries out.
> > > >>>>
> > > >>>> There were these questions about the proper install steps:
> > > >>>>
> > > > >>>> 1. Do we first install ytex-0.8
> > > > >>>> 2. Then install the new cTakes 3.1.1 instance and also apply the
> > > > >>>> SNAPSHOT lib and resources zips
> > > > >>>> 3. Work our way to install the UMLS ontologies in the db
> > > >>>>
> > > > >>>> It is not entirely clear from the new document
> > > > >>>> (https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1)
> > > > >>>> whether there's still a need to install ytex-0.8, or whether
> > > > >>>> YTEX has been entirely merged into cTakes.
> > > > >>>>
> > > > >>>> If the last statement is correct, there are missing parts, i.e.
> > > > >>>> in the UMLS install steps that are linked from the new ctakes
> > > > >>>> 3.1.1 document.
> > > >>>>
> > > >>>> Thanks,
> > > >>>> vlad
> > > >>>>
> > > >>>>
> > > > >>>> On Friday, January 3, 2014 10:21:52 PM UTC-5, vijay garla wrote:
> > > >>>>>
> > > >>>>> Hello All,
> > > >>>>>
> > > > >>>>> I have finished an initial cut at the port of YTEX to cTAKES
> > > > >>>>> 3.1.1.  Most of the YTEX functionality has been ported and
> > > > >>>>> integrated with cTAKES, and I've tested with MySQL and MS SQL
> > > > >>>>> Server (oracle tests pending).
> > > >>>>>
> > > >>>>> Most of the changes were made in new projects - very little
> > > >>>>> existing cTAKES code has been modified.  The only non-trivial
> > > > >>>>> changes are in
> > > > >>>>> /ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api
> > > > >>>>> - here I modified
> > > > >>>>> CharacterOffsetToLineTokenConverterCtakesImpl &
> > > > >>>>> SingleDocumentProcessorCtakes to deal with newlines within
> > > > >>>>> sentences correctly.  Can somebody take a look at the changes
> > > > >>>>> in the ytex branch?
> > > >>>>>
> > > > >>>>> I believe that the branch
> > > > >>>>> https://svn.apache.org/repos/asf/ctakes/branches/ytex is ready
> > > > >>>>> to be merged into ctakes trunk, but would like other users to
> > > > >>>>> test it as well.  Questions:
> > > >>>>>
> > > > >>>>> * How can I distribute the ctakes binary distribution to ytex
> > > > >>>>> users before the merge? Can we make the branch build available
> > > > >>>>> somewhere?  The binary distribution is too large to host on
> > > > >>>>> the ytex google code site (max 200 MB)
> > > > >>>>> * Non-ASF libraries - I have segregated these out into their
> > > > >>>>> own zip file that can be distributed via sourceforge.  As a
> > > > >>>>> stopgap, I can upload this to the ytex google code site, but
> > > > >>>>> would prefer to upload to sourceforge.
> > > > >>>>> * UMLS Derivatives - Ditto for these - would like to move to
> > > > >>>>> sourceforge.
> > > > >>>>> * Documentation - How can I update the confluence docs?  I
> > > > >>>>> would migrate the documentation from the google code website.
> > > >>>>>
> > > > >>>>> Here are the installation instructions (putting the wagon in
> > > > >>>>> front of the horse ...)
> > > > >>>>>
> > > > >>>>> https://code.google.com/p/ytex/wiki/Installation_cTAKES_3_1?ts=1388793998&updated=Installation_cTAKES_3_1
> > > >>>>>
> > > >>>>> Best,
> > > >>>>>
> > > >>>>> VJ
> > > >>>>>
> > > >>>>>
> > > >>>>>  --
> > > >>>> You received this message because you are subscribed to the
> > > >>>> Google Groups "ytex-users" group.
> > > >>>> To unsubscribe from this group and stop receiving emails from
> > > >>>> it, send an email to ytex-users+...@googlegroups.com.
> > > >>>> To post to this group, send email to ytex-...@googlegroups.com.
> > > >>>> To view this discussion on the web visit
> > > > >>>> https://groups.google.com/d/msgid/ytex-users/70f03a80-ce1a-4c0e-b35d-5116d1c93ea0%40googlegroups.com.
> > > >>>>
> > > >>>> For more options, visit https://groups.google.com/groups/opt_out.
> > > >>>>
> > > >>>
> > > >>>  --
> > > >> You received this message because you are subscribed to the
> > > >> Google Groups "ytex-users" group.
> > > >> To unsubscribe from this group and stop receiving emails from it,
> > > >> send an email to ytex-users+unsubscribe@googlegroups.com.
> > > >> To post to this group, send email to ytex-users@googlegroups.com.
> > > >> To view this discussion on the web visit
> > > >> https://groups.google.com/d/msgid/ytex-users/bc3bd705-55d2-4acd-
> > > a273-
> > > >> a3b1a7b36241%40googlegroups.com
> > > >> .
> > > >>
> > > >> For more options, visit https://groups.google.com/groups/opt_out.
> > > >>
> > > >
> > > >
> >
>
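
The build steps quoted earlier in this thread (svn co of the ytex branch, then mvn clean install -DskipTests) could be scripted roughly as follows. This is only a sketch: the branch URL and the Maven flag are copied from the thread, but the function names, the dry-run default and the assumption that the checkout directory does not already exist are illustrative choices, not part of cTAKES or YTEX.

```python
import subprocess

# URL and Maven arguments are copied from the thread; everything else is illustrative.
YTEX_BRANCH_URL = "https://svn.apache.org/repos/asf/ctakes/branches/ytex"

def build_commands(checkout_dir="ctakes", branch_url=YTEX_BRANCH_URL):
    """Return the checkout and build steps as (argv, working_directory) pairs."""
    return [
        (["svn", "co", branch_url, checkout_dir], None),
        (["mvn", "clean", "install", "-DskipTests"], checkout_dir),
    ]

def run_build(checkout_dir="ctakes", dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for argv, cwd in build_commands(checkout_dir):
        print(" ".join(argv), f"(in {cwd})" if cwd else "")
        if not dry_run:
            # Assumes svn and mvn are on the PATH. On Windows the Maven
            # launcher is usually a .cmd script, so the executable name may
            # need adjusting.
            subprocess.run(argv, cwd=cwd, check=True)
```

Calling run_build() with the default dry_run=True only prints the two commands, which is a cheap way to check them before starting a long checkout and build.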
