MIME-Version: 1.0
References: <924DE05C19409B438EB81DE683A942D91049BDFD@CHEXMBX1A.CHBOSTON.ORG> <996FC801C05DF64A84246A106FACACD007D19C@MSGPEXCHA08A.mfad.mfroot.org>
Date: Tue, 19 Feb 2013 17:47:02 -0500
Subject: Re: cTAKES 3.0.0 Feedback Was: RE: [DISCUSS] Graduate cTAKES from Incubator
From: andy mcmurry
To: ctakes-dev@incubator.apache.org
Content-Type: multipart/alternative; boundary=bcaec54c50f0c59bfd04d61b9ecc

--bcaec54c50f0c59bfd04d61b9ecc
Content-Type: text/plain; charset=ISO-8859-1

Thank you Troy!

*Summary: what is the purpose of the 3.0.0 release: changing the license to
Apache or getting new users?*

Releasing 3.0.0 without DOCS is OK so long as the expected user base is
CURRENT cTAKES users. If that is the case and this transition is 100% about
changing the license to Apache, then OK.

NEW users coming to cTAKES will probably be overwhelmed, for all the reasons
discussed. We will likely "lose" these new users who will not come back when
the docs are ready a month later.

*Question for the group: who is the intended audience of the 3.0.0 release?*

On Tue, Feb 19, 2013 at 5:11 PM, Bleeker, Troy C. wrote:

> Summarizing where we are now ... completed items at the bottom of the list
> for reference only.
>
> The community decided to release cTAKES 3.0 without the doc being complete
> - these must be next:
> - The Developer Guide is not complete.
> [TODO] Dev Guide needs command line install instructions for UMLS ID/pw
> and classpath. I'll work on this.
> - The User Guide has a caveat on the table at the bottom of the
> instructions because a similar set of examples was not distributed like it
> was in 2.5.
> The instructions are longer as well since the user could not
> just load and run existing samples.
> [TODO] Consider shipping test data resources from SourceForge in a ZIP
> file.
> - The Getting Started page needs to be written in context of all future
> releases not just 3.0.
> [NEEDS REVIEW] A page was written. Is it as expected?
> - Previous releases list. We need to both point to the NCI sites for 2.6
> and back plus create an archive for what will be the history of Apache
> releases. Needs work, you're right.
> [TODO] For now I removed 2 of the 3 links leaving only one that points
> back to NCI for 2.5 and back. Question is should we have a full listing of
> the 2.5 and previous releases on the Apache site or simply point to the NCI
> wiki. If we point to NCI then there is no archive to be had yet since the
> only release in Apache is the current release.
> - Component Use Guide pages needing updates. There are items marked in
> reddish color that are incorrect or in need of updates on these pages:
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+-+Core
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+-+Dependency+Parser+and+Semantic+Role+Labeler
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+-+Drug+Named+Entity+Recognition
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+-+NE+Contexts
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+-+PAD+Term+Spotter
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+-+POS+Tagger
> [TODO] Component knowledgeable people must update these pages.
>
> Potential priorities after that:
> - The examples, as described by Andy, would be more than a readme should
> have. This would be great for a how-to guide. The Developer Guide and User
> Guide have historically been install guides not how-to guides. I don't
> think a how-to guide should be incorporated into these but should be its
> own document.
> [TODO] Should the current user and dev guides be renamed?
> - cTAKES has never had a how-to guide that I know of. Making one would be
> great and as you say should include things like 1) pointers to where to
> find basic information 2) very high level overview of the components in the
> context of using them to do a very basic task like 3) I think it was
> suggested that the Getting Started page might be something like this in
> very short form. If we did that then it would point to a more comprehensive
> how-to guide.
> [TODO] Decide if we are going to do this.
> - Project history page of all cTAKES releases placed on Apache sites
> somewhere. Good plan if short. I would not copy readmes there but have
> links to them.
> [TODO] This was done in the past but removed from the bottom of the
> downloads page. This page exists now but is not linked to from the Apache
> cTAKES site. Here is a direct link:
> http://incubator.apache.org/ctakes/roadmap.html It would need 3.0 info
> added if we decide to use it.
> - Creating a single download for a newcomer.
> [LATER] This has been discussed and tabled by the community for the time
> being in order to get the 3.0 release done and out the door. We need to
> come back to this in order to make the best first impression.
>
> Completed:
> - The downloads page must work. It now seems to function alright ***IF***
> you refresh the page or select a mirror and click the Change button. If you
> do neither and try to download you get this error: "The requested URL
> /ctakes/[preferred]incubator/ctakes/apache-ctakes-3.0.0-incubating-bin.tar.gz
> was not found on this server." Anyone have time to fix this?
> [WORK-AROUND] Seems intermittent. Tried 5 different machines. James and
> Troy changed the downloads page to tell the user to select the Change
> button when they have issues. It should work if they do that. Best guess -
> the randomly selected mirror sites do not all work.
> Also, selecting a site
> in the drop-down and pressing the Change button does not set the mirror
> site to the one you selected. Next best guess - other Apache sites have a
> double // in the URL just after the mirror domain in the file download
> link. Maybe this is required. Tried this too.
> - Adding a link to the install instructions makes it obvious (which I have
> done to the page) but it was there in a sense.
> [DONE] Link added.
> - "Last official release" was held until now. Since 3.0 is going to be
> officially announced, 3.0 will go there. I made that change as well.
> [DONE] Reworked the page.
> - A list of changed features has not been high priority since the original
> goal was to make a 3.0 in Apache that pretty much matched the function of
> 2.5. The only thing that changed was how the product is built and shipped.
> Nevertheless we need to state at least that somewhere.
> [DONE] That was wrong, there is new function. Relation Extractor now
> documented on the downloads page.
> - The resources file is 1.1 GB not 2 right?
> [DONE] Andy said it in an email. The web site lists it fine.
> - There are still 3.0 developer and user guide pages on the cTAKES home
> site that should be removed so no one stumbles on to them.
> [DONE] Removed.
> - Where would a newcomer hit first? Internet search for "ctakes" or
> "ctakes 3.0" is probably first. Top hits on those lists should be modified
> to point to the best Apache cTAKES landing page.
> [DONE] James and Troy made changes to the top hit pages and other places
> that made sense.
> - The current guides are still not complete.
> [DONE] User Guide James and Troy went through.
>
> Thanks
> Troy
>
> -----Original Message-----
> From: ctakes-dev-return-1250-Bleeker.Troy=mayo.edu@incubator.apache.org
> [mailto:ctakes-dev-return-1250-Bleeker.Troy=mayo.edu@incubator.apache.org]
> On Behalf Of Masanz, James J.
> Sent: Tuesday, February 19, 2013 10:10 AM
> To: 'ctakes-dev@incubator.apache.org'
> Subject: RE: cTAKES 3.0.0 Feedback Was: RE: [DISCUSS] Graduate cTAKES from Incubator
>
> > - The resources file is 1.1 GB not 2 right?
>
> I agree. But I don't see it listed as 2GB on the download page. If you
> tell me where you saw it listed as 2GB I will update that page.
>
> > - A list of changed features has not been high priority since
> I will update the downloads page right now stating the relation extractor
> is new for 3.0
>
> > - Where would a newcomer hit first? Internet search for "ctakes" or
> > "ctakes 3.0" is probably first. Top hits on those lists should be
> > modified to point to the best Apache cTAKES landing page.
>
> I modified the following pages to have a link to Apache cTAKES home page
>
> https://wiki.nci.nih.gov/display/VKC/cTAKES+2.5
> https://sourceforge.net/projects/ohnlp/files/cTAKES/
> https://sourceforge.net/projects/ohnlp/
> http://ohnlp.sourceforge.net/
>
> The update to the last one is not appearing yet, but it was updated.
>
> > - The Getting Started page needs to be written in context of all future
> > releases not just 3.0.
>
> Looks like you updated that page, thanks.
>
> > - The current guides are still not complete.
>
> I took a quick run through the User Guide on the Wiki and made some
> updates.
>
> -- James Masanz
>
> > -----Original Message-----
> > From: ctakes-dev-return-1239-Masanz.James=mayo.edu@incubator.apache.org
> > [mailto:ctakes-dev-return-1239-Masanz.James=mayo.edu@incubator.apache.org]
> > On Behalf Of Bleeker, Troy C.
> > Sent: Monday, February 18, 2013 10:58 AM
> > To: ctakes-dev@incubator.apache.org
> > Subject: RE: cTAKES 3.0.0 Feedback Was: RE: [DISCUSS] Graduate cTAKES
> > from Incubator
> >
> > All the suggestions and discussion are good. There's a lot here, sorry
> > for the long summary. First things first:
> >
> > - The downloads page must work.
> > It now seems to function alright
> > ***IF*** you refresh the page or select a mirror and click the Change
> > button. If you do neither and try to download you get this error: "The
> > requested URL
> > /ctakes/[preferred]incubator/ctakes/apache-ctakes-3.0.0-incubating-bin.tar.gz
> > was not found on this server." Anyone have time to fix this?
> > - Adding a link to the install instructions makes it obvious (which I
> > have done to the page) but it was there in a sense. The page said "Use
> > the Developer and User Guides to direct you through the installation
> > process." and the links to those were just to the left in the hierarchy.
> > - "Last official release" was held until now. Since 3.0 is going to be
> > officially announced, 3.0 will go there. I made that change as well.
> > - A list of changed features has not been high priority since the
> > original goal was to make a 3.0 in Apache that pretty much matched the
> > function of 2.5. The only thing that changed was how the product is
> > built and shipped. Nevertheless we need to state at least that
> > somewhere.
> > - The resources file is 1.1 GB not 2 right?
> >
> > Keep in mind that the community decided to release cTAKES 3.0 without
> > the doc being complete, but these must be next:
> > - The current guides are still not complete. I made it through the
> > developer guide but the user guide still has problems. I get errors
> > after installing and running scripts. I have not been able to try the
> > comparison test that was available previously. The table at the bottom I
> > have not got to yet.
> > - We have 2 sites now 1) cTAKES home http://incubator.apache.org/ctakes/
> > 2) cTAKES doc https://cwiki.apache.org/confluence/display/CTAKES. I've
> > done my best to minimize a user going back and forth. We have it this
> > way because a useful guide is not easy (IMHO or even possible) with
> > markdown text in the cTAKES home pages.
> > There are still 3.0 developer
> > and user guide pages on the cTAKES home site that should be removed so
> > no one stumbles on to them.
> > - The Getting Started page needs to be written in context of all future
> > releases not just 3.0.
> > - Previous releases list. We need to both point to the NCI sites for 2.6
> > and back plus create an archive for what will be the history of Apache
> > releases. Needs work, you're right.
> >
> > Potential priorities after that:
> > - The examples, as described by Andy, would be more than a readme should
> > have. This would be great for a how-to guide.
> > - The Developer Guide and User Guide have historically been install
> > guides not how-to guides. I don't think a how-to guide should be
> > incorporated into these but should be its own document.
> > - cTAKES has never had a how-to guide that I know of. Making one would
> > be great and as you say should include things like 1) pointers to where
> > to find basic information 2) very high level overview of the components
> > in the context of using them to do a very basic task like 3) I think it
> > was suggested that the Getting Started page might be something like this
> > in very short form. If we did that then it would point to a more
> > comprehensive how-to guide.
> > - Project history page of all cTAKES releases placed on Apache sites
> > somewhere. Good plan if short. I would not copy readmes there but have
> > links to them.
> > I already did this for cTAKES 2.5 and past:
> > https://wiki.nci.nih.gov/display/VKC/cTAKES+Roadmap
> > Move this page to Apache? Have a page on Apache that continues this and
> > points back to what already exists?
> > Also, I had this project history on the Apache cTAKES downloads page but
> > that section was removed when 3.0 was placed on there. If you can find
> > the history of changes to that page you may find something already done
> > in markdown format.
> > - Creating a single download for a newcomer.
> > This has been discussed and
> > tabled by the community for the time being in order to get the 3.0
> > release done and out the door. We need to come back to this in order to
> > make the best first impression.
> >
> > Troy
> >
> > -----Original Message-----
> > From: ctakes-dev-return-1230-Bleeker.Troy=mayo.edu@incubator.apache.org
> > [mailto:ctakes-dev-return-1230-Bleeker.Troy=mayo.edu@incubator.apache.org]
> > On Behalf Of Chen, Pei
> > Sent: Friday, February 15, 2013 10:17 PM
> > To: ctakes-dev@incubator.apache.org
> > Subject: cTAKES 3.0.0 Feedback Was: RE: [DISCUSS] Graduate cTAKES from
> > Incubator
> >
> > Thanks Andy for the feedback.
> > Examples are a good idea- Were you thinking of adding it to the README
> > file or the confluence user guide?
> >
> > Feel free to update the downloads page(s) (it uses Apache CMS) and the
> > User Guides (Confluence wiki).
> > Note: The release is still being replicated to all of the mirrors and
> > may take up to 24 hrs, so I would wait until after the weekend before
> > testing all of the mirror links.
> >
> > --Pei
> > ________________________________________
> > From: Andy McMurry [mcmurry.andy@gmail.com]
> > Sent: Friday, February 15, 2013 11:08 PM
> > To: ctakes-dev@incubator.apache.org
> > Subject: Re: [DISCUSS] Graduate cTAKES from Incubator
> >
> > Clarifications
> >
> > There isn't a last Apache release. But there are last previous NIH,
> > Sourceforge, and Apache releases?
> > TODO: Project History Page (Simple, just the releases and times, ideally
> > with JIRA generated release notes).
> >
> > Suggestion: Demonstration > explanation. Use Examples !!
> >
> > EXAMPLE 1: Basic Pipeline (without UMLS)
> >
> > ** SHOW Before and after clinical text, demonstrates purpose
> > ** LIST the 5 steps
> > ** Most impressive demo would be the smoking status pipeline
> >
> > EXAMPLE 2: Basic Pipeline (with UMLS)
> >
> > ** SHOW Before and after (input text -> output annotations)
> > ** LIST the steps
> > ** Most impressive demo would be a negation of a cancer diagnosis and
> > NER of a medication (chemotherapeutic drug).
> >
> > Thoughts?
> > --andy
> >
> >
> > On Feb 15, 2013, at 7:36 PM, Andy McMurry wrote:
> >
> > > Sure thing Pei.
> > >
> > > I don't think cTAKES is ready for attention grabbing release (humble
> > > opinion).
> > > And when you release you want to grab attention!! cTAKES is awesome!!
> > >
> > > Suggestions (release blockers)
> > >
> > > (1) Downloads
> > > http://incubator.apache.org/ctakes/downloads.cgi
> > > ! Link to install instructions is not there but "Verifying signatures"
> > > takes up 20% of the page. NEEDS OBVIOUS LINK TO INSTALL INSTRUCTIONS.
> > > ! Last official release is blank because there isn't one, remove it !
> > > First mirror I tried was a 404? (not sure which one). I changed the
> > > mirror then OK. Test all mirrors (script) ? Previous releases are VERY
> > > confusing.
> > > ? The NIH and SourceForge pages should redirect to cTAKES, google
> > > "cTAKES download" and imagine how confused a beginner would be.
> > >
> > > (2) User Guide
> > > http://incubator.apache.org/ctakes/3.0.0/user-guide-3.0
> > > ? 3.0.0 : no list of new features from last stable release. Why would
> > > a user bother to upgrade to a beta?
> > > ! Would be better to have a bundled download with resources, if
> > > possible. Otherwise, make it clear to a newcomer what the benefit of
> > > getting UMLS / LVG is. (one sentence).
> > > ! Needs a very high level overview of the components in the context of
> > > using them to do a very basic task like.
> > > ! This is likely the most frequently accessed document for cTAKES. It
> > > has almost no pointers to where to find basic information.
> > >
> > > (3) OTHER
> > > * The NCI and SourceForge links are now highly confusing.
> > > * While I am downloading, I should be reading the recommended "Get
> > > Started" guide
> > >
> > > I'm still downloading the 2GB resources file.
> > > I'll try and get back to you about the install when that is done too.
> > >
> > > This constructive criticism is because I believe cTAKES is AWESOME.
> > > Hard to see how awesome it is given the current instructions.
> > >
> > > --Andy
> > >
> > >
> > > On Feb 15, 2013, at 5:02 PM, "Chen, Pei" wrote:
> > >
> > >> Hi Andy,
> > >> So much has changed in cTAKES since last year, if you have a chance-
> > >> do you also want to try downloading the -bin and ensure at least the
> > >> steps in the README are able to get you started?
> > >>
> > >> --Pei
> > >> ________________________________________
> > >> From: Andy McMurry [mcmurry.andy@gmail.com]
> > >> Sent: Friday, February 15, 2013 4:04 PM
> > >> To: ctakes-dev@incubator.apache.org
> > >> Subject: Re: [DISCUSS] Graduate cTAKES from Incubator
> > >>
> > >> Suggestion: can we get a good programmer with no cTAKES experience to
> > >> kick the tires and tell us how long it took to get started?
> > >>
> > >> John Resig (jQuery founder) once told me "if it takes more than 15
> > >> minutes to get started, then that is way too long".
> > >>
> > >> "What is necessary is that enough investment be put into presentation
> > >> that newcomers can get past the obstacle of unfamiliarity. ...
> > >> Hacktivation energy: the amount of energy a newcomer must put in before
> > >> she starts getting something back"
> > >> -- From "Producing Open Source Software"
> > >>
> > >> http://books.google.com/books?id=0vbr7xvvzjgC&pg=PA21&lpg=PA21&dq=hacktivation+energy&source=bl&ots=D0hP85ndwz&sig=G5HO-7GbLqQPwLaI6210D9WGk2E&hl=en&sa=X&ei=N6EeUZXVHMHhiALq3YG4BQ&ved=0CDoQ6AEwAQ#v=onepage&q=hacktivation%20energy&f=false
> > >>
> > >>
> > >> On Feb 15, 2013, at 12:55 PM, "Chen, Pei" wrote:
> > >>
> > >>> This is to open a discussion to graduate Apache cTAKES podling from
> > >>> the Apache Incubator.
> > >>>
> > >>> Apache cTAKES entered the Incubator in June of 2012. We have made
> > >>> significant progress with the project since moving over to Apache. We
> > >>> currently have 18 committers listed on our status page at [1] including
> > >>> over 10 which accepted after the podling was formed.
> > >>>
> > >>> During incubation, cTAKES has:
> > >>> * Produced 1 Release
> > >>> * Added 10 new Committer/PPMC members and shows constant community
> > >>> activities
> > >>> * Cleared IP on code
> > >>> * Developed Roadmap(s) for the next major and minor releases in a
> > >>> community process and started working on that [2]
> > >>> * The community of Apache cTAKES is active, healthy, and growing and
> > >>> has demonstrated the ability to self-govern using accepted Apache
> > >>> practices.
> > >>>
> > >>> [1] http://people.apache.org/committers-by-project.html#ctakes
> > >>> [2] https://issues.apache.org/jira/browse/CTAKES#selectedTab=com.atlassian.jira.plugin.system.project%3Aroadmap-panel

--bcaec54c50f0c59bfd04d61b9ecc--