ctakes-dev mailing list archives

From "Wu, Stephen T., Ph.D." <Wu.Step...@mayo.edu>
Subject Re: Common Type System across systems?
Date Tue, 01 Oct 2013 14:39:38 GMT
Pei et al,
That was the vision for the SHARP "common type system", except it was
meant to include medical-related projects rather than general projects.

Steve's process below is probably the most realistic way to do things, and
it's basically how we did the current cTAKES type system.  Unfortunately,
the "someone" doing #1 was me, and I didn't realize how difficult it would
be.  I know more about how to do it now, but #1 and #2 were both harder
than I expected.  I'm adding a #4:

(1) Have someone inspect the various type systems closely and make a
proposal
  A. Know each of the type systems on its own.  It is essential to
visualize them well, but even then it is difficult to understand the
implications of type changes just by looking. (By the way, we never came
up with a really great automatic visualization tool; the closest was a
Protégé plugin. Excellent visualization would go a long way, especially if
edits were possible.)
  B. Categorize portions of type systems to compare and take them a step
at a time.
  C. Clearly limit which type systems you are going to consider for your
comparison and reconciliation.
  D. Pick a starting point.  I found it nearly impossible to create from
scratch when you're staring at 4-5 other type systems.  We started from
the old cTAKES type system but that did cause some bias!
  E. Develop real criteria (or at least opinions) for choosing between the
many options.

(2) Agree on the proposal.
  A. Multiple projects should make a binding agreement to implement. This
means, most likely, that somebody needs to have assurance of funding.
In our case, we only made it binding for cTAKES, so it is only used by
cTAKES (as far as I know).
  B. With different projects' vested interests on the line, have some real
discussions of what each project is going to give up under the proposal.

(3) Spend the time to re-write all the code to use the new type system.
  * As Steve said, this is time-consuming, especially if things get broken
and models need to be retrained, etc.

(4) Ensure maintenance and modifiability across projects.
  A. The original SHARP common type system vision handed off the
maintenance to the Software Development Group, but that never really
happened. I hope the Apache community can fill that role to some degree,
but so far it has still depended on unreliable people like myself.
  B. A means of having everyone automatically draw from the same source
code would be preferable.
  C. If, in the future, you need to consider another UIMA project whose
type system should be reconciled... Well, that's happening right now.  I
guess you can worry about it when you get there if you have a community
that's willing to deal with it.
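
As a rough illustration of (1)A and (1)B, a first-pass comparison of two
type systems can be scripted against their UIMA type system descriptor
XML files.  The sketch below is not the process used for cTAKES; it only
lists which types and features overlap.  The sample descriptors and the
org.example.* type names are hypothetical, and matching types by their
short (unqualified) name is a simplifying assumption that will miss
renamed types and mis-match unrelated types that share a name.

```python
import xml.etree.ElementTree as ET

# XML namespace used by UIMA descriptors.
NS = {"u": "http://uima.apache.org/resourceSpecifier"}

# Hypothetical descriptors, inlined so the sketch is self-contained.
SAMPLE_A = """
<typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <types>
    <typeDescription>
      <name>org.example.a.Token</name>
      <supertypeName>uima.tcas.Annotation</supertypeName>
      <features>
        <featureDescription><name>pos</name>
          <rangeTypeName>uima.cas.String</rangeTypeName></featureDescription>
        <featureDescription><name>lemma</name>
          <rangeTypeName>uima.cas.String</rangeTypeName></featureDescription>
      </features>
    </typeDescription>
    <typeDescription>
      <name>org.example.a.Sentence</name>
      <supertypeName>uima.tcas.Annotation</supertypeName>
    </typeDescription>
  </types>
</typeSystemDescription>
"""

SAMPLE_B = """
<typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <types>
    <typeDescription>
      <name>org.example.b.Token</name>
      <supertypeName>uima.tcas.Annotation</supertypeName>
      <features>
        <featureDescription><name>pos</name>
          <rangeTypeName>org.example.b.POSTag</rangeTypeName></featureDescription>
        <featureDescription><name>stem</name>
          <rangeTypeName>uima.cas.String</rangeTypeName></featureDescription>
      </features>
    </typeDescription>
    <typeDescription>
      <name>org.example.b.Chunk</name>
      <supertypeName>uima.tcas.Annotation</supertypeName>
    </typeDescription>
  </types>
</typeSystemDescription>
"""


def load_types(xml_text):
    """Map each qualified type name to (supertype, {feature: range type})."""
    root = ET.fromstring(xml_text)
    types = {}
    for td in root.iterfind(".//u:typeDescription", NS):
        name = td.findtext("u:name", default="", namespaces=NS).strip()
        supertype = td.findtext("u:supertypeName", default="", namespaces=NS).strip()
        features = {}
        for fd in td.iterfind("u:features/u:featureDescription", NS):
            fname = fd.findtext("u:name", default="", namespaces=NS).strip()
            frange = fd.findtext("u:rangeTypeName", default="", namespaces=NS).strip()
            features[fname] = frange
        types[name] = (supertype, features)
    return types


def short_name(qualified):
    """Drop the package prefix: 'org.example.a.Token' -> 'Token'."""
    return qualified.rsplit(".", 1)[-1]


def compare(a, b):
    """Report types unique to each system and feature differences for shared ones."""
    a_short = {short_name(n): n for n in a}
    b_short = {short_name(n): n for n in b}
    report = {
        "only_a": sorted(set(a_short) - set(b_short)),
        "only_b": sorted(set(b_short) - set(a_short)),
        "shared": {},
    }
    for s in sorted(set(a_short) & set(b_short)):
        fa = a[a_short[s]][1]
        fb = b[b_short[s]][1]
        report["shared"][s] = {
            "only_a": sorted(set(fa) - set(fb)),
            "only_b": sorted(set(fb) - set(fa)),
            # Same feature name, different declared range type.
            "range_conflicts": sorted(f for f in set(fa) & set(fb) if fa[f] != fb[f]),
        }
    return report


report = compare(load_types(SAMPLE_A), load_types(SAMPLE_B))
print(report)
```

A report like this only narrows down where to look.  Deciding which
variant of a conflicting type or feature to keep is still the judgment
call described in (1)E.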


Those are just some thoughts.  It's not impossible, but neither is it
simple.

stephen




On 9/30/13 8:17 PM, "Steven Bethard" <steven.bethard@gmail.com> wrote:

>We (ClearTK) talked with Richard (DKPro) about doing this for ClearTK
>and DKPro. Basically, both groups were all for it, but the main issue
>was time. Basically you need to:
>
>(1) Have someone inspect the various type systems closely and make a
>proposal
>(2) Agree on the proposal.
>(3) Spend the time to re-write all the code to use the new type system.
>
>Step (3) is especially time consuming, but in fact, we never managed
>to get the free time for step (1).
>
>That all said, ClearTK would love to share a common type system with
>other projects.
>
>Steve
>
>
>On Mon, Sep 30, 2013 at 7:38 PM, Pei Chen <chenpei@apache.org> wrote:
>> Richard, I, and a few others had an interesting bar conversation...
>> In the spirit of interoperability, what if we had a baseline common type
>> system that could be reused across UIMA-compatible NLP systems?
>> Imagine for a moment that OpenNLP, ClearTK, ClearNLP, DKPro, cTAKES, etc.
>> could come up with a common baseline type system that could be reused.
>> It may sound like a dream, but it could be doable if we could factor out
>> and find the common ground. Perhaps we could start with the syntactic
>> features... and then extend it for more specific domain use cases?
>>
>> --Pei

