incubator-ctakes-dev mailing list archives

From "Miller, Timothy" <Timothy.Mil...@childrens.harvard.edu>
Subject Re: models in jars - was FW: [VOTE] Apache cTAKES 3.0.0-incubating RC5 release
Date Mon, 21 Jan 2013 20:42:52 GMT
Hi James,
I can't answer conclusively, but I believe those are all models trained using the ClearTK
framework.  There may be a way of packaging them as separate files rather than jars, but I'm
not sure that would have any benefit, since:
1) You will almost never be, and should not be, modifying models trained using machine learning
methods.
2) It may clutter up the models directory -- if you have a 10-class classifier it will build
10 one-vs-all classifiers and package them together (along with some metadata), and I think
it is valuable to encapsulate this even for many of the core cTAKES developers.
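
To make point 2 concrete, here is a minimal sketch of that kind of packaging. It assumes a
hypothetical layout (a metadata.json entry plus one models/<label>-vs-rest.model entry per
class); those names and the placeholder contents are illustrative only and are not the format
ClearTK actually writes. Since a jar is a zip archive, Python's standard zipfile module is
enough to build and list one:

```python
import io
import json
import zipfile

# Hypothetical labels for a 10-class classifier (illustrative only).
CLASS_LABELS = [f"class_{i}" for i in range(10)]


def build_model_jar(labels):
    """Package one-vs-all model files plus metadata into one in-memory jar."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as jar:
        jar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
        # Assumed metadata layout, not the real ClearTK format.
        jar.writestr("metadata.json", json.dumps({"labels": labels}))
        for label in labels:
            # Placeholder text stands in for trained weights.
            jar.writestr(f"models/{label}-vs-rest.model", f"weights for {label}\n")
    return buffer.getvalue()


def list_entries(jar_bytes):
    """Return the entry names stored in the jar."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        return jar.namelist()


jar_bytes = build_model_jar(CLASS_LABELS)
entries = list_entries(jar_bytes)
print(len(entries), "entries in one jar")
```

In this sketch the manifest, the metadata and the ten per-class models all travel as a single
file, so the models directory would hold one jar rather than a dozen loose files.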

As for whether these models change or evolve: they will certainly change as features or
more data are added, but I wouldn't really say evolve, at least not in a sense that SVN
can take advantage of.  A small change in the training input will change these models almost
completely, so it wouldn't really be valuable to, e.g., view diffs between component model
files even if they are ASCII.

Dima or Steve Bethard may be able to answer better but I'm not sure if they are checking email
today.
Tim

On Jan 21, 2013, at 2:56 PM, Masanz, James J. wrote:

> 
> Tim and Dima,
> 
> This question came up on general@incubator.apache.org regarding models included within the source release:
> 
>> As for the models, I don't think there is any issue in keeping them in
>> jars, but the question is why?  Are they never going to evolve or change?
>> Wouldn't you keep them as source and just package them as jars for use at
>> runtime?  As I said, this is not an issue, I am just curious.
> 
> If you have any input I should include in a response to Matt F, let me know.
> 
> Otherwise my response will probably be to open a JIRA issue, with this as something to be changed in a future release (including building the jars at build time).
> 
> -- James
> 
>> -----Original Message-----
>> From: general-return-39329-Masanz.James=mayo.edu@incubator.apache.org
>> [mailto:general-return-39329-Masanz.James=mayo.edu@incubator.apache.org]
>> On Behalf Of Matt Franklin
>> Sent: Monday, January 21, 2013 12:47 PM
>> To: general@incubator.apache.org
>> Subject: Re: [VOTE] Apache cTAKES 3.0.0-incubating RC5 release
>> 
>> On Mon, Jan 21, 2013 at 12:16 PM, Masanz, James J.
>> <Masanz.James@mayo.edu> wrote:
>>> 
>>> Regarding the comment about compiled jars in the source tree:
>>> 
>>> The following jars, even though they are under src directories, contain resources (models), not Java classes.
>>> 
>>> conll-2009-dev-shift-pop.jar
>>> dummy.dep.mod.jar
>>> mayo-dep.jar
>>> wordnet-3.0-lemma-data.jar
>>> dummy.srl.mod.jar
>>> en_srl_ontonotes.jar
>>> mayo-srl.jar
>>> clearparser_models.jar
>>> 
>>> degree_of/model.jar
>>> em_pair/model.jar
>>> modifier_extractor/model.jar
>>> 
>>> Are there other jars you were referring to?
>> 
>> ./ctakes-assertion/lib/jcarafe-core_2.9.1-0.9.8.3.RC4.jar
>> ./ctakes-assertion/lib/jcarafe-ext_2.9.1-0.9.8.3.RC4.jar
>> ./ctakes-assertion/lib/med-facts-i2b2-1.2-SNAPSHOT.jar
>> ./ctakes-assertion/lib/med-facts-zoner-1.1.jar
>> ./ctakes-constituency-parser/lib/libsvm-2.91.jar
>> ./ctakes-coreference/lib/commons-io-2.1.jar
>> ./ctakes-coreference/lib/commons-lang3-3.0.1.jar
>> ./ctakes-coreference/lib/Jama-1.0.2.jar
>> ./ctakes-coreference/lib/libsvm-2.91.jar
>> ./ctakes-dependency-parser/lib/args4j-2.0.16.jar
>> ./ctakes-dependency-parser/lib/clearparser-0.33.jar
>> ./ctakes-dependency-parser/lib/cleartk-util-0.8.1.jar
>> ./ctakes-dependency-parser/lib/commons-io-2.0.1.jar
>> ./ctakes-dependency-parser/lib/commons-lang-2.4.jar
>> ./ctakes-dependency-parser/lib/commons-logging-1.1.1.jar
>> ./ctakes-dependency-parser/lib/hppc-0.3.1.jar
>> ./ctakes-dependency-parser/lib/uimafit-1.2.0.jar
>> 
>> These are just a few that exist in SVN. All dependencies that are compiled
>> code need to be externally referenced.
>> 
>> As for the models, I don't think there is any issue in keeping them in
>> jars, but the question is why?  Are they never going to evolve or change?
>> Wouldn't you keep them as source and just package them as jars for use at
>> runtime?  As I said, this is not an issue, I am just curious.
>> 
>>> 
>>> I will look at the NOTICE file this afternoon.
>>> 
>>> Regards,
>>> James Masanz
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: general-return-39326-Masanz.James=mayo.edu@incubator.apache.org
>>>> [mailto:general-return-39326-Masanz.James=mayo.edu@incubator.apache.org]
>>>> On Behalf Of Masanz, James J.
>>>> Sent: Monday, January 21, 2013 9:51 AM
>>>> To: 'general@incubator.apache.org'
>>>> Subject: RE: [VOTE] Apache cTAKES 3.0.0-incubating RC5 release
>>>> 
>>>> The result of the VOTE for 3.0.0-incubating on the dev list is at
>>>> http://mail-archives.apache.org/mod_mbox/incubator-ctakes-dev/201301.mbox/browser
>>>> 
>>>> The source artifact can be found in
>>>> http://people.apache.org/~chenpei/ctakes-3.0.0-incubating/rc5/target/
>>>> 
>>>> I'll look at the jars and the root NOTICE file.
>>>> 
>>>> Thanks for your review!
>>>> -- James Masanz
>>>> 
>>>>> -----Original Message-----
>>>>> From: general-return-39325-Masanz.James=mayo.edu@incubator.apache.org
>>>>> [mailto:general-return-39325-Masanz.James=mayo.edu@incubator.apache.org]
>>>>> On Behalf Of Matt Franklin
>>>>> Sent: Monday, January 21, 2013 7:42 AM
>>>>> To: general@incubator.apache.org
>>>>> Cc: ctakes-dev@incubator.apache.org
>>>>> Subject: Re: [VOTE] Apache cTAKES 3.0.0-incubating RC5 release
>>>>> 
>>>>> I have some issues with this release as it currently stands:
>>>>> 
>>>>> * Where is the result of the VOTE thread on the dev list?
>>>>> * Where is the source artifact?  The artifact linked in the vote
>>>>> thread appears to be your convenience binary release.
>>>>> * There are compiled jars in the source tree.  These need to be
>>>>> externalized in some fashion.
>>>>> * There are LICENSE & NOTICE files in individual project
>>>>> directories that contain entries that don't appear in the root
>>>>> NOTICE file.  If you intend on releasing the subcomponents
>>>>> individually, this makes some sense; but I think that the entries
>>>>> should be merged into the root NOTICE file
>>>>> 
>>>>> 
>>>>> On Fri, Jan 18, 2013 at 9:39 AM, Coarr, Matt <mcoarr@mitre.org> wrote:
>>>>>> Hi, we just need one more Incubator PMC vote for cTAKES version 3.0.
>>>>>> 
>>>>>> Matt
>>>>>> 
>>>>>> 
>>>>>> ---
>>>>>> From: <Chen>, Pei <Pei.Chen@childrens.harvard.edu>
>>>>>> Subject: Collecting IPMC votes
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> This is a call for a vote on releasing the following candidate as
>>>>>> Apache cTAKES 3.0.0-incubating.
>>>>>> This will be our first release.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> A vote is also held on the developer mailing list:
>>>>>> http://mail-archives.apache.org/mod_mbox/incubator-ctakes-dev/201301.mbox/browser
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> For more detailed information on the changes/release notes, please visit:
>>>>>> 
>>>>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313621&version=12322969
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> The release was made using the cTAKES release process documented here:
>>>>>> http://incubator.apache.org/ctakes/ctakes-release-guide.html
>>>>>> 
>>>>>> The candidate is available at:
>>>>>> 
>>>>>> http://people.apache.org/~chenpei/ctakes-3.0.0-incubating/rc5/target/apache-ctakes-3.0.0-incubating-bin.tar.gz
>>>>>> /.zip
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> The tag to be voted on:
>>>>>> 
>>>>>> http://svn.apache.org/repos/asf/incubator/ctakes/tags/ctakes-3.0.0-incubating-rc5/
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> The MD5 checksum of the tarball can be found at:
>>>>>> 
>>>>>> http://people.apache.org/~chenpei/ctakes-3.0.0-incubating/rc5/target/apache-ctakes-3.0.0-incubating-bin.tar.gz.md5
>>>>>> /.zip.md5
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> The signature of the tarball can be found at:
>>>>>> 
>>>>>> http://people.apache.org/~chenpei/ctakes-3.0.0-incubating/rc5/target/apache-ctakes-3.0.0-incubating-bin.tar.gz.asc
>>>>>> /.zip.asc
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Apache cTAKES' KEYS file, containing the PGP keys used to sign the release:
>>>>>> 
>>>>>> http://svn.apache.org/repos/asf/incubator/ctakes/tags/ctakes-3.0.0-incubating-rc5/KEYS
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Please vote on releasing these packages as Apache cTAKES 3.0.0-incubating.
>>>>>> The vote is open for at least the next 72 hours.
>>>>>> 
>>>>>> Only votes from Incubator PMC are binding, but folks are welcome
>>>>>> to check the release candidate and voice their approval or disapproval.
>>>>>> The vote passes if at least three binding +1 votes are cast.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> [ ] +1 Release the packages as Apache cTAKES 3.0.0-incubating
>>>>>> [ ] -1 Do not release the packages because...
>>>>>> 
>>>>>> Thanks!
>>>>>> 
>>>>>> Pei
>>>>>> 
>>>>>> P.S. Here is my +1.
>>>>>> 
>>>>> 
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>>>>> For additional commands, e-mail: general-help@incubator.apache.org
>>>> 
>>>> 
>>> 
>>> 
>>> 
>> 
> 

