ctakes-dev mailing list archives

From Pei Chen <chen...@apache.org>
Subject Re: Common Type System across systems?
Date Tue, 01 Oct 2013 19:06:05 GMT
Yes, I think this is a slight augmentation and extension of the original
vision of the clinical common type system: having it work with other
UIMA-based NLP systems.  Having worked on item (3) for cTAKES, I actually
think the tough part will be getting consensus and agreement on a type system
among all parties, and less the required code changes.  Hence, I just
wanted to ping the community to gauge interest and see if this actually
makes sense. [It would be nice to plug in different POS taggers, for example,
without having to remap types.]
If we have a willing volunteer (Richard :)?) to perform some of the preliminary
analysis in Q1 2014 with our existing type system, perhaps we can actually
make this happen.
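
A minimal sketch of what a first pass at that preliminary analysis might
look like, assuming each project ships a standard UIMA type system
descriptor XML. The two descriptors below (org.example.alpha and
org.example.beta) are hypothetical stand-ins, not the real cTAKES, ClearTK,
or DKPro type systems, and matching types by short name is only a naive
heuristic for finding candidates to discuss.

```python
import xml.etree.ElementTree as ET

# Namespace used by UIMA type system descriptor files.
NS = {"u": "http://uima.apache.org/resourceSpecifier"}

# Hypothetical descriptors for illustration only.
DESC_A = """<typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <types>
    <typeDescription>
      <name>org.example.alpha.syntax.Token</name>
      <supertypeName>uima.tcas.Annotation</supertypeName>
      <features>
        <featureDescription><name>pos</name><rangeTypeName>uima.cas.String</rangeTypeName></featureDescription>
        <featureDescription><name>lemma</name><rangeTypeName>uima.cas.String</rangeTypeName></featureDescription>
      </features>
    </typeDescription>
    <typeDescription>
      <name>org.example.alpha.syntax.Sentence</name>
      <supertypeName>uima.tcas.Annotation</supertypeName>
    </typeDescription>
  </types>
</typeSystemDescription>"""

DESC_B = """<typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <types>
    <typeDescription>
      <name>org.example.beta.Token</name>
      <supertypeName>uima.tcas.Annotation</supertypeName>
      <features>
        <featureDescription><name>pos</name><rangeTypeName>uima.cas.String</rangeTypeName></featureDescription>
        <featureDescription><name>stem</name><rangeTypeName>uima.cas.String</rangeTypeName></featureDescription>
      </features>
    </typeDescription>
    <typeDescription>
      <name>org.example.beta.Chunk</name>
      <supertypeName>uima.tcas.Annotation</supertypeName>
      <features>
        <featureDescription><name>label</name><rangeTypeName>uima.cas.String</rangeTypeName></featureDescription>
      </features>
    </typeDescription>
  </types>
</typeSystemDescription>"""


def load_types(descriptor_xml):
    """Return {short type name: {feature name: range type}} for one descriptor."""
    root = ET.fromstring(descriptor_xml)
    types = {}
    for td in root.iterfind(".//u:typeDescription", NS):
        full_name = td.findtext("u:name", default="", namespaces=NS).strip()
        short_name = full_name.rsplit(".", 1)[-1]
        features = {}
        for fd in td.iterfind("u:features/u:featureDescription", NS):
            fname = fd.findtext("u:name", default="", namespaces=NS).strip()
            frange = fd.findtext("u:rangeTypeName", default="", namespaces=NS).strip()
            features[fname] = frange
        types[short_name] = features
    return types


def compare(a, b):
    """For types whose short names match, list shared and project-specific features."""
    report = {}
    for short_name in sorted(set(a) & set(b)):
        feats_a, feats_b = set(a[short_name]), set(b[short_name])
        report[short_name] = {
            "shared": sorted(feats_a & feats_b),
            "only_a": sorted(feats_a - feats_b),
            "only_b": sorted(feats_b - feats_a),
        }
    return report


if __name__ == "__main__":
    for name, diff in compare(load_types(DESC_A), load_types(DESC_B)).items():
        print(name, diff)
```

A report like this only surfaces name overlaps; deciding whether two
same-named features really mean the same thing would still need the human
review described in the steps quoted below.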

4a) I think the SHARP4 development group has essentially moved to the
cTAKES ASF community which is probably even better since it already has a
meritocratic/governance mechanism to handle changes.

On Tue, Oct 1, 2013 at 10:39 AM, Wu, Stephen T., Ph.D.

> Pei et al,
> That was the vision for the SHARP "common type system", except it was
> meant to include medical-related projects rather than general projects.
> Steve's process below is probably the most realistic way to do things, and
> it's basically how we did the current cTAKES type system.  Unfortunately,
> the "someone" doing #1 was me, and I didn't realize that it would be quite
> difficult.  I guess I know more about how to do it now, but #1 and #2 were
> harder than I expected.  I'm adding a #4:
> (1) Have someone inspect the various type systems closely and make a
> proposal
>   A. Know each of the type systems on their own.  Essential to visualize
> them appropriately, but it is still difficult to understand the
> implications of type changes just by looking. (By the way, we never came
> up with a really great automatic visualization tool, closest was a Protégé
> plugin. Excellent visualization would go a long way, especially if edits
> were possible.)
>   B. Categorize portions of type systems to compare and take them a step
> at a time.
>   C. Clearly limit which type systems you are going to consider for your
> comparison and reconciliation.
>   D. Pick a starting point.  I found it nearly impossible to create from
> scratch when you're staring at 4-5 other type systems.  We started from
> the old cTAKES type system but that did cause some bias!
>   E. Develop real criteria (or at least opinions) for choosing between the
> many options.
> (2) Agree on the proposal.
>   A. Multiple projects should make a binding agreement to implement. This
> means, most likely, that somebody needs to have assurance of funding.
> In our case, we only made it binding for cTAKES, so it is only used by
> cTAKES (as far as I know).
>   B. With different projects' vested interests on the line, have some real
> discussions of what your project is going to give up with the proposed
> stuff.
> (3) Spend the time to re-write all the code to use the new type system.
>   * As Steve said, this is time-consuming, especially if things get broken
> and models need to be retrained, etc.
> (4) Ensure maintenance and modifiability across projects.
>   A. The original SHARP common type system vision handed off the
> maintenance to the Software Development Group, but that never really
> happened. I hope the Apache community can serve as this to some degree,
> but so far it has still depended on unreliable people like myself.
>   B. A means of having everyone automatically draw from the same source
> code would be preferable.
>   C. If, in the future, you need to consider another UIMA project whose
> type system should be reconciled... Well, that's happening right now.  I
> guess you can worry about it when you get there if you have a community
> that's willing to deal with it.
> Those are just some thoughts.  It's not impossible, but neither is it
> simple.
> stephen
> On 9/30/13 8:17 PM, "Steven Bethard" <steven.bethard@gmail.com> wrote:
> >We (ClearTK) talked with Richard (DKPro) about doing this for ClearTK
> >and DKPro. Both groups were all for it, but the main issue
> >was time. Basically you need to:
> >
> >(1) Have someone inspect the various type systems closely and make a
> >proposal
> >(2) Agree on the proposal.
> >(3) Spend the time to re-write all the code to use the new type system.
> >
> >Step (3) is especially time-consuming, but in fact, we never managed
> >to get the free time for step (1).
> >
> >That all said, ClearTK would love to share a common type system with
> >other projects.
> >
> >Steve
> >
> >
> >On Mon, Sep 30, 2013 at 7:38 PM, Pei Chen <chenpei@apache.org> wrote:
> >> Richard, I, and a few others had an interesting bar conversation...
> >> In the spirit of interoperability, what if we had a baseline common type
> >> system that could be reused across UIMA-compatible NLP systems?
> >> Imagine for a moment that OpenNLP, ClearTK, ClearNLP, DKPro, cTAKES, etc.
> >> could come up with a common baseline type system that could be reused.
> >> It may sound like a dream, but it could be doable if we could factor out
> >> and find the common ground. Perhaps we could start with the syntactical
> >> features... and then extend it for more specific domain use cases?
> >>
> >> --Pei
