From: "Masanz, James J."
To: "'ctakes-dev@incubator.apache.org'"
Subject: RE: [DISCUSS] no binary release of cTAKES here at Apache? FW: [VOTE] Apache cTAKES 3.0.0-incubating RC5 release
Date: Fri, 25 Jan 2013 17:35:21 +0000

> -----Original Message-----
> From: ctakes-dev-return-1113-Masanz.James=mayo.edu@incubator.apache.org
> [mailto:ctakes-dev-return-1113-Masanz.James=mayo.edu@incubator.apache.org]
> On Behalf Of Chen, Pei
> Sent: Friday, January 25, 2013 9:42 AM
> To: ctakes-dev@incubator.apache.org
> Subject: RE: [DISCUSS] no binary release of cTAKES here at Apache? FW:
> [VOTE] Apache cTAKES 3.0.0-incubating RC5 release
>
> Based on the ongoing discussions,
> Could I suggest we cancel the VOTE on RC5 and create an RC6?

+1 to that

> RC6 will be extremely conservative:

Again +1 to that

> - No resources (models) included in src/main/java
> - No resources (models) included in the -bin.tar.gz
> - Move all of the models and resources to a ctakes-models project within
> the ctakes-resources on sourceforge (currently used by the UMLS resources
> already).
> - Update the pom.xml files to download those for developers via maven.
> - End-Users will have to download and unzip a ctakes-resources.zip which
> contains all of the models and resources (including UMLS).

All sounds good as a compromise for this release to me.

> I believe this is just a temporary measure (at least a decent compromise)
> until we get clarity on some of these items.
> We can create subsequent releases afterwards such as a single -bin.tar.gz
> that includes the models just like any other 3rd party lib,

The models are dependent on the outcome of
https://issues.apache.org/jira/browse/LEGAL-157

Regards,
James Masanz

> and then possibly including it in src as well.
>
> I do not think this is an "end user friendly" issue, IMHO, it just doesn't
> make sense to separate out parts of software that are an intricate part
> of the software and are always required to function properly such as
> icons, gifs, jpgs, or statistical models in this case (which have been
> approved to be released under ASL 2.0 terms by their contributors).
>
> --Pei
>
>
> > -----Original Message-----
> > From: Masanz, James J. [mailto:Masanz.James@mayo.edu]
> > Sent: Friday, January 25, 2013 10:11 AM
> > To: 'ctakes-dev@incubator.apache.org'
> > Subject: RE: [DISCUSS] no binary release of cTAKES here at Apache? FW:
> > [VOTE] Apache cTAKES 3.0.0-incubating RC5 release
> >
> >
> > > -----Original Message-----
> > > From:
> > > ctakes-dev-return-1106-Masanz.James=mayo.edu@incubator.apache.org
> > > [mailto:ctakes-dev-return-1106-Masanz.James=mayo.edu@incubator.apache.
> > > org]
> > > On Behalf Of Mattmann, Chris A (388J)
> > > Sent: Friday, January 25, 2013 2:10 AM
> > > To: ctakes-dev@incubator.apache.org
> > > Subject: Re: [DISCUSS] no binary release of cTAKES here at Apache? FW:
> > > [VOTE] Apache cTAKES 3.0.0-incubating RC5 release
> > >
> > > Hey James,
> > >
> > > On 1/24/13 11:55 PM, "Masanz, James J." wrote:
> > >
> > > >I posted on general@incubator that:
> > > >
> > > >> One goal is to have a binary that contains all resources, which
> > > >> can be used to install cTAKES on a system that does not have an
> > > >> internet connection.
> > > >> For now we can focus on a first Apache release that doesn't meet
> > > >> that goal, while pursuing the question with legal.
> > > >> If legal says we can't have that kind of binary here, then in
> > > >> the future we can consider if we will host such a binary on a
> > > >> different site.
> > > >
> > > >http://s.apache.org/bgp
> > > >
> > > >Another motivation for this email is a post by Benson (below) to
> > > >general@incubator, where he writes "It's not the mission of the ASF
> > > >to create complete, end-user-friendly, software products".
> > >
> > > Just to clarify -- that's Benson, talking for Roy. :) I realize that
> > > this has got all skitzo lately, but just pointing out that this is
> > > far from doctrine. Apache OpenOffice is a prime counter example to
> > > his point and I just made that point myself.
> > >
> > > >I suggest we, or whoever among us is interested in such a thing,
> > > >host an easy-to-install *binary* that includes cTAKES plus the
> > > >models and jars, somewhere other than apache.org, that would be a
> > > >single download with a simple unzip (and would be built off Apache
> > > >cTAKES 3.0.0-incubating, once it is released).
> > >
> > > If it comes to this, I'd recommend hosting it at
> > > http://apache-extras.org/ which is Google Code, but branded with
> > > Apache through a special ComDev agreement set up.
> > > Products developed there are said to have an "affinity"
> > > towards particular Apache products, but not be those Apache products.
> > > Apache Extras != Apache, but still is an option for those parts.
> > >
> > > >This binary would probably be released shortly after each Apache
> > > >cTAKES release, so it could be built from the officially released
> > > >Apache cTAKES source.
> > >
> > > Yep. I don't think the battle is over there yet though -- I liked
> > > your suggestion however -- let's just roll a source release, and try
> > > to push the convenience binaries as needed.
> > >
> > > >From my understanding, we cannot have models in SVN here if they
> > > >were built from data that is not available to the community since
> > > >the models are not "source". That's based on this specific comment
> > > >within LEGAL-157:
> > > >https://issues.apache.org/jira/browse/LEGAL-157?focusedCommentId=13561092&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13561092
> > >
> > > That's Benson's opinion, note Roy hasn't replied to him. I don't
> > > read Roy's reading on the subject to be that we can't include those
> > > intermediate outputs? Do you?
> >
> > Yes, that's the way I'm reading Roy's post - that it can't include
> > models (intermediate outputs) because the source for those
> > intermediate outputs is not being included.
> >
> > > >We also cannot have other compiled jars in our SVN here at
> > > >apache.org, and therefore they cannot be in our source release, which
> > > >we are working on addressing
> > >
> > > That's not recommended, but also not an absolute blocker and can be
> > > improved incrementally. Prior versions of Apache Lucene (and
> > > anything built from Ant) had this issue and those releases shipped
> > > just fine.
> >
> > That's great to know. Thanks.
> >
> > > >For people checking out code from SVN and using maven, those are
> > > >not such big issues since maven will fetch the dependencies once we
> > > >finish updating the POMs etc.
> > > >
> > > >If we want to allow people to download a single binary and get the
> > > >cTAKES code and the models, it sounds like we either need to
> > > >1) write something that would download the models for the users
> > > >2) or host the binaries elsewhere
> > > >(or require users to download things separately and put them
> > > >together).
> > >
> > > I would highly suggest #1 to avoid fragmentation.
> > >
> > > >I strongly dislike option 1, so I will focus on option 2 in this
> > > >email, as that will be more than enough for one email anyway ;)
> > >
> > > Why don't you like option #1? Just curious.
> >
> > Two reasons - a goal is to have an install that is as simple as
> > possible to reduce barriers for (very busy) people to give cTAKES a
> > try. (There will be times when downloading models of 100s of MB will
> > fail for one reason or another on the first attempt.)
> >
> > And secondly, the personal experience I've had with writing
> > (commercial) install code, which very often turned into a vastly more
> > difficult and time-consuming (testing-wise) task than people would
> > allow for, and also resulted in more end-user questions than
> > anticipated. Which leads to an admittedly personal bias against such
> > things, if they can be avoided. But I mentioned #1 because I know my
> > views on #2 are partially a personal bias.
> >
> > > >For people to host such an all-inclusive binary elsewhere, those
> > > >people would need to choose a name.
> > > >We could create a logo for their use, something like "Apache cTAKES
> > > >inside" or "Powered by Apache cTAKES" (see
> > > >http://www.apache.org/foundation/marks/pmcs.html#poweredby) and make
> > > >it clear the binary is not being released directly by Apache
> > > >http://s.apache.org/BAj
> > > >
> > > >I suggest that we wouldn't need to create a convenience binary here
> > > >at Apache - one less thing to test and document.
> > > >
> > > >This would bring up several questions though, which I'm guessing we
> > > >don't want to get into here in great detail since it is really
> > > >about something that is not to be released directly from Apache.
> > > > - what to call the binary (we would not simply be able to call it
> > > >"Apache cTAKES")
> > > > - where to host the binary (I'd suggest the ohnlp sourceforge
> > > >project, where previous versions of cTAKES live)
> > > > - we would need a place to hold the documentation for this binary.
> > > >I am assuming we could not host it at apache.org, but we would need
> > > >that either confirmed here or create a legal Jira to get that
> > > >confirmation.
> > > > - where would we tell people to go to post questions about the
> > > >binary?
> > > > - where would the build of the binary take place
> > > >
> > > >I suggest taking those questions offline unless someone tells me
> > > >those things are indeed OK to discuss here.
> > > >
> > > >My main point to discuss here is whether there is enough value in
> > > >providing a convenience binary of Apache cTAKES here at apache.org
> > > >(which would not contain the models) for us to create and support
> > > >it here, or if we skip creating a binary here at apache.org and only
> > > >create source packages here.
> > > >
> > > >I am not trying to splinter the group here. I would hope anyone
> > > >involved in producing the binary would be involved here with Apache
> > > >cTAKES too.
> > > >But there might be people involved in Apache cTAKES that aren't
> > > >interested in the details of how a binary is produced or what it
> > > >looks like, or even if it is produced.
> > >
> > > That's a possibility but brings with it a whole horde of other legal
> > > mumbo jumbo (and trademarks@) that trust me you don't want to go
> > > down (yet).
> > > Maybe ever :)
> > >
> > > Try and focus on #1 -- I bet it's achievable without all the
> > > convenience binaries part. Would that work for the community?
> >
> > We have previously (before Apache) received lots of positive end user
> > feedback about what an improvement providing an all-inclusive binary
> > was for them.
> > Not providing it is a step backward for us.
> >
> > > Cheers,
> > > Chris
> > >
> > > >
> > > >-- James
> >
> > -- James
> >
> > > >> -----Original Message-----
> > > >> From:
> > > >> general-return-39392-Masanz.James=mayo.edu@incubator.apache.org
> > > >> [mailto:general-return-39392-Masanz.James=mayo.edu@incubator.apache.org]
> > > >> On Behalf Of Benson Margulies
> > > >> Sent: Thursday, January 24, 2013 9:23 PM
> > > >> To: general@incubator.apache.org
> > > >> Subject: Re: [VOTE] Apache cTAKES 3.0.0-incubating RC5 release
> > > >>
> > > >> It's unfortunate to have this conversation in parallel here and
> > > >> on https://issues.apache.org/jira/browse/LEGAL-157.
> > > >>
> > > >> Also, this thread is a combo of the discussion of ordinary
> > > >>jars-of-classes (where I'd forgotten the policy) and the much
> > > >>more tangled question of models, which is what the JIRA is
> > > >>wrestling with.
> > > >>
> > > >> To answer Ted, I think that Roy might write something like:
> > > >>
> > > >> "It's not the mission of the ASF to create complete,
> > > >>end-user-friendly, software products. It's our mission to create
> > > >>open source code.
> > > >>If someone else wants to build up an
> > > >>end-user-friendly aggregation of ASF code and models from bombs
> > > >>or whatever, that's great, and we encourage them."
> > > >>
> > > >> On Thu, Jan 24, 2013 at 8:19 PM, Branko Čibej wrote:
> > > >> > On 25.01.2013 01:50, Ted Dunning wrote:
> > > >> >> On Fri, Jan 25, 2013 at 7:37 AM, Branko Čibej wrote:
> > > >> >>
> > > >> >>> On 21.01.2013 21:08, Benson Margulies wrote:
> > > >> >>> ...>>
> > > >> >>>>> I am referring to this discussion http://s.apache.org/MUZ
> > > >> >>>> Well, that's clear enough, even if it is a typical example of
> > > >> >>>> how our founders yell at us but we have no mechanism to
> > > >> >>>> channel those yells into concise, unambiguous, documentation.
> > > >> >>> Perhaps off-topic ... but I fail to see how "source release"
> > > >> >>> is ambiguous or not concise.
> > > >> >>>
> > > >> >>> Unless the Java world has a different definition of "source
> > > >> >>> code" than us stuck-in-the-mud plodders, and it's only considered
> > > >> >>> binary once it's been JIT-compiled. :)
> > > >> >>>
> > > >> >>
> > > >> >> It isn't necessarily ambiguous when applied to code, but there
> > > >> >> is a different case when applied to models or parameter
> > > >> >> settings.
> > > >> >>
> > > >> >> For instance, Commons Math has polynomial coefficients
> > > >> >> embedded in code that approximate certain functions. These
> > > >> >> are the results of computations done using other systems and
> > > >> >> the source code and the data used in those other computations
> > > >> >> are not included in the released code, only the parameter values
> > > >> >> are.
> > > >> >>
> > > >> >> This same sort of thing applies here except that the model in
> > > >> >> question has a much larger set of values and is being packaged
> > > >> >> in a binary, inspectable format. Would your opinion change if
> > > >> >> the model were expressed as a textual model?
> > > >> >> Would it matter
> > > >> >> that the textual model is too large and obtuse to usefully
> > > >> >> inspect?
> > > >> >
> > > >> > In cases like this one, it would seem reasonable for the source
> > > >> > code to refer to those models and computations, which
> > > >> > presumably anyone can then reproduce to their own satisfaction.
> > > >> > This is unlike compiled code in that compilation results are
> > > >> > notoriously hard to reproduce exactly, because they depend on
> > > >> > many factors that are usually hard to document, let alone
> > > >> > reproduce. I'd expect a mathematical model, no matter how
> > > >> > large, does not suffer from such
> > > >> > ambiguities (and shut up, Gödel).
> > > >> >
> > > >> > However, that's beside the point, because ...
> > > >> >
> > > >> >> What about a hypothetical case where the model is derived from
> > > >> >> the explosion of a nuclear bomb? Would the release of the
> > > >> >> numbers require the inclusion of a suitable bomb design so
> > > >> >> that everybody could replicate the derivation?
> > > >> >
> > > >> > ... the issue is not about exposing all the knowledge that
> > > >> > goes into writing the code, but about exposing the code itself so
> > > >> > that it can be reviewed for, e.g., back-doors and other security
> > > >> > issues.
> > > >> > Neither of your examples is relevant.
> > > >> >
> > > >> > -- Brane