ctakes-dev mailing list archives

From John Green <john.travis.gr...@gmail.com>
Subject Re: YTEX semantic similarity concept graph questions
Date Thu, 16 Oct 2014 12:32:59 GMT
That does, thank you, as always.

One other question: your docs say it should take around an hour and a half
with 8 GB of RAM for the UMLS. My times are turning out significantly lower
(3-5 minutes). The *.gz output seems to be on the same order of magnitude as
the included compressed concept graphs, and queries seem to run OK, but it
makes me a little nervous that it is processing that fast. Should I be
worried?

Thanks,
JG

On Thu, Oct 16, 2014 at 6:29 AM, vijay garla <vngarla@gmail.com> wrote:

> I don't know what the difference between PAR/CHD (parent/child) and RB/RN
> (broader/narrower) is supposed to be.  Some UMLS source vocabularies use
> PAR/CHD only or predominantly (e.g. SNOMED-CT); others use RB/RN (e.g.
> RXNORM).  You can use and experiment with whatever relationships you want
> (I think there might be part-of/contains relationships too).
>
> The concept graph is a directed acyclic graph, and the query should return
> parent-child edges (or maybe the other way around, not sure).  If your
> query uses e.g. rel in ('PAR', 'CHD'), it will return edges going in both
> directions.  This shouldn't cause any problems, as we discard edges that
> induce cycles, but it will create a bunch of overhead for no gain.
>
> If you look at other concept graph configs, e.g.
>
> https://svn.apache.org/repos/asf/ctakes/trunk/ctakes-ytex-res/src/main/resources/org/apache/ctakes/ytex/conceptGraph/sct-rxnorm.template.xml
>
> you will see that we use both PAR & RB relationships.
>
> HTH,
>
> VJ
>
>
>
>
>
> On Thu, Oct 16, 2014 at 2:58 AM, John Green <john.travis.green@gmail.com>
> wrote:
>
> > Hope this finds everyone well.
> >
> > It is not immediately clear to me why
> >
> >         select distinct cui1, cui2
> >         from umls.MRREL
> >         where sab in ('SNOMEDCT')
> >         and rel in ('PAR')
> >         order by cui1, cui2
> >
> > would only be selecting the relationship (REL) of PAR. I'm not sure of the
> > selection criteria. This is honestly probably directed mostly at Vijay, but
> > anyone else with experience in this domain would be a welcome voice. In the
> > paper on YTEX, for instance, PAR and RB are chosen for UMLS. Why? Does this
> > have to do with the "flattening" or "orphaning" that UMLS does to the
> > vocabularies it includes? Why not PAR, RB, and RN? Why not more? Was this a
> > computational (speed/memory) consideration, or a functional one that my
> > lack of familiarity with the domain is keeping me from seeing?
> >
> > I'm posting this fairly specific question to the dev list because it
> > directly relates to building YTEX concept graphs, which is a functionality
> > of our distro here.
> >
> > Best!
> > JG
> >
>
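
The remarks above about rel in ('PAR', 'CHD') returning edges in both
directions, and about discarding edges that induce cycles, can be illustrated
with a small self-contained Python sketch. Everything in it is an assumption
made for illustration: the in-memory SQLite table standing in for umls.MRREL,
the made-up CUIs, and the helper names fetch_edges, creates_cycle, and
build_acyclic. It is not YTEX's actual implementation, which may detect and
drop cycles differently.

```python
import sqlite3

# Toy stand-in for umls.MRREL. The CUIs and rows are invented for illustration.
ROWS = [
    # (cui1, cui2, rel, sab)
    ("C0000001", "C0000002", "PAR", "SNOMEDCT"),
    ("C0000002", "C0000001", "CHD", "SNOMEDCT"),
    ("C0000002", "C0000003", "PAR", "SNOMEDCT"),
    ("C0000003", "C0000002", "CHD", "SNOMEDCT"),
    ("C0000004", "C0000005", "RB", "RXNORM"),
]


def fetch_edges(conn, sabs, rels):
    """Run the thread's query shape with parameterized sab/rel lists."""
    sab_marks = ",".join("?" * len(sabs))
    rel_marks = ",".join("?" * len(rels))
    sql = (
        "select distinct cui1, cui2 from MRREL "
        f"where sab in ({sab_marks}) and rel in ({rel_marks}) "
        "order by cui1, cui2"
    )
    return conn.execute(sql, [*sabs, *rels]).fetchall()


def creates_cycle(successors, src, dst):
    """True if dst already reaches src, so adding src -> dst closes a cycle."""
    stack, seen = [dst], set()
    while stack:
        node = stack.pop()
        if node == src:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(successors.get(node, ()))
    return False


def build_acyclic(edges):
    """Add edges one by one, dropping any edge that would induce a cycle."""
    successors = {}
    kept, dropped = [], []
    for src, dst in edges:
        if creates_cycle(successors, src, dst):
            dropped.append((src, dst))
        else:
            successors.setdefault(src, set()).add(dst)
            kept.append((src, dst))
    return kept, dropped


conn = sqlite3.connect(":memory:")
conn.execute("create table MRREL (cui1 text, cui2 text, rel text, sab text)")
conn.executemany("insert into MRREL values (?, ?, ?, ?)", ROWS)

par_only = fetch_edges(conn, ["SNOMEDCT"], ["PAR"])
both = fetch_edges(conn, ["SNOMEDCT"], ["PAR", "CHD"])
kept, dropped = build_acyclic(both)

print("PAR only:", len(par_only), "edges")
print("PAR + CHD:", len(both), "edges,", len(dropped), "dropped as cycle-inducing")
```

On this toy data the PAR/CHD query returns twice as many edges as the
PAR-only query, and the edges that survive cycle removal are the same ones
the PAR-only query returned, which is the "overhead for no gain" case
described above. Which direction survives depends on the order in which edges
are added, so the sketch makes no claim about whether cui1 or cui2 is the
parent.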
