ctakes-dev mailing list archives

From Richard Eckart de Castilho <...@apache.org>
Subject Re: scala and groovy
Date Fri, 13 Dec 2013 22:19:19 GMT
I can understand your reservations. However, they appear to be similar to the reservations
that some people have against using Maven (which also automatically downloads stuff, although
it is aimed at developers) or using web-services (e.g. the UMLS service used by cTAKES).

A Groovy script is certainly no replacement for a full download, for all the reasons that
you are describing. I think it can be a supplement for those who do not want to start out
with the full download.

It may be possible to combine both approaches, though. E.g. use the same script in a scenario
which does auto-downloading and in a scenario where the user has downloaded a distribution.
In the second case, the distribution would have to come with proper configuration files to
point the artifact resolution mechanism at the folders to which the distribution has been
downloaded. It sounds reasonable, but it is probably much less straightforward than it sounds.
But in the end, that is part of the idea: you can trade convenience (auto-downloads)
for control (pre-downloaded artifacts).
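
To make the two scenarios concrete, here is a minimal sketch of how one script could be launched either way. It assumes a hypothetical distribution layout (a conf/grapeConfig.xml file inside the distribution folder) and relies on Grape's grape.config and grape.root system properties; nothing like this exists in cTAKES today.

```shell
#!/bin/sh
# Sketch only: the distribution layout (conf/grapeConfig.xml) is hypothetical.
# grape.config and grape.root are Groovy Grape system properties.

# Print the JVM options for a run. Without an argument, Grape keeps its
# defaults and downloads on demand; with a distribution directory, artifact
# resolution is pointed at the settings and files shipped in that directory.
grape_opts() {
    dist_dir="$1"
    if [ -z "$dist_dir" ]; then
        # Auto-download scenario: only report what gets downloaded.
        printf '%s\n' "-Dgroovy.grape.report.downloads=true"
    else
        # Pre-downloaded scenario: use the distribution's settings and folder.
        printf '%s\n' "-Dgrape.config=$dist_dir/conf/grapeConfig.xml -Dgrape.root=$dist_dir"
    fi
}
```

A launcher could then run the unchanged script with something like JAVA_OPTS="$(grape_opts /path/to/dist) $JAVA_OPTS" ./pipeline.groovy, where the path and the script name are placeholders.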

I believe the script approach also shows where resource handling could be improved, e.g.
by distributing certain resources as Maven artifacts and/or incorporating the ability to automatically
download resources directly in analysis engines. IMHO, there shouldn't be any code which
explicitly downloads resources.

In DKPro Core, we support both. If a resource is available on the classpath (e.g. by virtue
of being a Maven dependency, by being referred to by a @Grab, or by having been downloaded
as part of a distribution), it is used from there. Otherwise, our AEs try to automatically
download the resource from our Maven repository (unless this is explicitly disabled).
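
The lookup order described here can be sketched as a small decision function. This is only an illustration of the order (present locally, else download, unless downloading is disabled); it is not DKPro Core code, and a plain file check stands in for the classpath lookup.

```shell
#!/bin/sh
# Sketch of the lookup order only; this is not DKPro Core code.
# The function name and the auto_download switch are hypothetical.

# Decide where a resource should come from and print the decision:
#   local    - the resource is already present (stands in for "on the classpath")
#   download - not present, and automatic download is allowed
#   missing  - not present, and automatic download was disabled
resolve_resource() {
    resource_path="$1"
    auto_download="${2:-true}"   # downloading is on unless explicitly disabled

    if [ -e "$resource_path" ]; then
        echo "local"
    elif [ "$auto_download" = "true" ]; then
        echo "download"
    else
        echo "missing"
        return 1
    fi
}
```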

In my experience, using technologies like Maven or Grapes in a corporate environment should
be supplemented by a private artifact repository run by the corporation, e.g. to reduce network
issues when talking to external repositories, or to distribute proprietary artifacts (resources,
analysis components, or other libraries). Corporate users should then use this repository
as a general proxy to access any artifacts. E.g. at the UKP Lab, we run such an internal repository.
All our users get all artifacts through there - it caches everything anybody ever used, so
we can even continue to use artifacts should the remote repository be down temporarily, permanently,
or if artifacts got deleted. We trust in the Maven infrastructure, but we like to have control
over the artifacts.
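
For Grape, routing everything through such a repository comes down to an Ivy settings file. The following is a minimal sketch that writes one; the repository URL is a placeholder and the file is an assumed minimal configuration, not the one we actually use at the UKP Lab.

```shell
#!/bin/sh
# Sketch only: the repository URL passed in is a placeholder, and this minimal
# settings file is an assumption, not a configuration used by the UKP Lab.

# Write an Ivy settings file for Grape that resolves everything through one
# internal repository acting as a proxy for external Maven repositories.
write_grape_config() {
    repo_url="$1"
    out_file="$2"
    cat > "$out_file" <<EOF
<ivysettings>
  <settings defaultResolver="downloadGrapes"/>
  <resolvers>
    <chain name="downloadGrapes">
      <ibiblio name="internal" root="$repo_url" m2compatible="true"/>
    </chain>
  </resolvers>
</ivysettings>
EOF
}
```

Calling write_grape_config https://repo.example.org/maven ~/.groovy/grapeConfig.xml would put the file where Grape looks for it by default; the URL is made up for the example.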

Some stuff, like the Groovy scripts, we do only as a service to newbies or for doing small
things, e.g. simple conversion pipelines. They are a result of trying to provide some usable
examples for people that have reservations against installing Eclipse, setting up Maven, etc.
And they appear to be less intimidating than Java to people who know e.g. Python, because
they are directly executable and quite readable. 

I'm not perfectly happy with them, because there is still stuff that is too technical, e.g.
all the import statements. Eventually, a similar technology would be nice which only consists
of the pipeline declaration (no @Grabs, no imports), but still functions in the same way (including
auto-downloads). But, that is - just as the pre-deploy scenario - future work ;)

Anyway, I would also like to thank you for experimenting with the idea and testing its implications
in a corporate environment! 

-- Richard

On 13.12.2013, at 19:46, "Masanz, James J." <Masanz.James@mayo.edu> wrote:

> 
> Thanks Richard for doing all that testing.
> 
> But the idea that we cannot easily get to what is causing the issue, together with the
> fact Tim was able to reproduce one of my issues [1], leads me to question using dynamic downloading
> of anything for our users.
> 
> I would prefer to see a single download that a user extracts from, which I see having
> the following advantages
> - no mysterious suspected network issues
> - user can be told how much space will be taken up
> - user has easy control where things will be put (rather than having to configure where
>   grapes will be stored, if user does not want them under their home directory)
> 
> That's my 2 cents.
> 
> Yes, I am behind a firewall. And in fact I am VPN'd in to work. But I suspect some of
> our users do that too.
> 
> [1] http://markmail.org/message/lgo7eyruotl7nnix
> 
> -- James
> 
> 
> -----Original Message-----
> From: dev-return-2322-Masanz.James=mayo.edu@ctakes.apache.org [mailto:dev-return-2322-Masanz.James=mayo.edu@ctakes.apache.org]
> On Behalf Of Richard Eckart de Castilho
> Sent: Friday, December 13, 2013 3:36 PM
> To: dev@ctakes.apache.org
> Subject: Re: scala and groovy
> 
> Hi James,
> 
> I enabled info on the grape resolving using
> 
>  export JAVA_OPTS="-Dgroovy.grape.report.downloads=true $JAVA_OPTS"
> 
> Then I tried your script three times. 
> 
> 1) First, I just ran without any changes to my system (custom grapeConfig.xml which avoids
> using .m2/repository, no flush of ~/.groovy/grapes). It downloaded some missing artifacts and
> printed the message.
> 
> 2) Then I deleted my ~/.groovy/grapes folder and tried again. It downloaded all artifacts
> and printed the message.
> 
> 3) Then - just to make sure - I removed my customized grapeConfig.xml. Then I deleted
> my ~/.m2/repository and ~/.groovy/grapes again. It downloaded all artifacts and printed the
> message. It couldn't be a cleaner test than this one, I suppose.
> 
> So here is the full output of the third run:
> 
> $ ./blah
> Resolving dependency: org.cleartk#cleartk-util;0.9.2 {default=[default]}
> Preparing to download artifact org.cleartk#cleartk-util;0.9.2!cleartk-util.jar
> Preparing to download artifact org.apache.uima#uimaj-core;2.4.0!uimaj-core.jar
> Preparing to download artifact org.uimafit#uimafit;1.4.0!uimafit.jar
> Preparing to download artifact args4j#args4j;2.0.16!args4j.jar
> Preparing to download artifact com.google.guava#guava;13.0!guava.jar
> Preparing to download artifact com.carrotsearch#hppc;0.4.1!hppc.jar
> Preparing to download artifact commons-io#commons-io;2.4!commons-io.jar
> Preparing to download artifact commons-lang#commons-lang;2.4!commons-lang.jar
> Preparing to download artifact org.apache.uima#uimaj-tools;2.4.0!uimaj-tools.jar
> Preparing to download artifact org.springframework#spring-core;3.1.0.RELEASE!spring-core.jar
> Preparing to download artifact org.springframework#spring-context;3.1.0.RELEASE!spring-context.jar
> Preparing to download artifact org.apache.uima#uimaj-cpe;2.4.0!uimaj-cpe.jar
> Preparing to download artifact org.apache.uima#uimaj-document-annotation;2.4.0!uimaj-document-annotation.jar
> Preparing to download artifact org.apache.uima#uimaj-adapter-vinci;2.4.0!uimaj-adapter-vinci.jar
> Preparing to download artifact org.apache.uima#jVinci;2.4.0!jVinci.jar
> Preparing to download artifact org.springframework#spring-asm;3.1.0.RELEASE!spring-asm.jar
> Preparing to download artifact commons-logging#commons-logging;1.1.1!commons-logging.jar
> Preparing to download artifact org.springframework#spring-aop;3.1.0.RELEASE!spring-aop.jar
> Preparing to download artifact org.springframework#spring-beans;3.1.0.RELEASE!spring-beans.jar
> Preparing to download artifact org.springframework#spring-expression;3.1.0.RELEASE!spring-expression.jar
> Preparing to download artifact aopalliance#aopalliance;1.0!aopalliance.jar
> Downloaded 8478 Kbytes in 44860ms:
>  [SUCCESSFUL ] org.cleartk#cleartk-util;0.9.2!cleartk-util.jar (1385ms)
>  [SUCCESSFUL ] org.apache.uima#uimaj-core;2.4.0!uimaj-core.jar (5326ms)
>  [SUCCESSFUL ] org.uimafit#uimafit;1.4.0!uimafit.jar (1553ms)
>  [SUCCESSFUL ] commons-lang#commons-lang;2.4!commons-lang.jar (2357ms)
>  [SUCCESSFUL ] org.apache.uima#uimaj-tools;2.4.0!uimaj-tools.jar (2097ms)
>  [SUCCESSFUL ] org.apache.uima#uimaj-cpe;2.4.0!uimaj-cpe.jar (1625ms)
>  [SUCCESSFUL ] org.apache.uima#uimaj-adapter-vinci;2.4.0!uimaj-adapter-vinci.jar (1035ms)
>  [SUCCESSFUL ] org.apache.uima#jVinci;2.4.0!jVinci.jar (1475ms)
>  [SUCCESSFUL ] org.apache.uima#uimaj-document-annotation;2.4.0!uimaj-document-annotation.jar (715ms)
>  [SUCCESSFUL ] org.springframework#spring-core;3.1.0.RELEASE!spring-core.jar (1992ms)
>  [SUCCESSFUL ] org.springframework#spring-asm;3.1.0.RELEASE!spring-asm.jar (1044ms)
>  [SUCCESSFUL ] commons-logging#commons-logging;1.1.1!commons-logging.jar (904ms)
>  [SUCCESSFUL ] org.springframework#spring-context;3.1.0.RELEASE!spring-context.jar (2643ms)
>  [SUCCESSFUL ] org.springframework#spring-aop;3.1.0.RELEASE!spring-aop.jar (1988ms)
>  [SUCCESSFUL ] aopalliance#aopalliance;1.0!aopalliance.jar (798ms)
>  [SUCCESSFUL ] org.springframework#spring-beans;3.1.0.RELEASE!spring-beans.jar (2178ms)
>  [SUCCESSFUL ] org.springframework#spring-expression;3.1.0.RELEASE!spring-expression.jar (4688ms)
>  [SUCCESSFUL ] args4j#args4j;2.0.16!args4j.jar (1289ms)
>  [SUCCESSFUL ] com.google.guava#guava;13.0!guava.jar (4265ms)
>  [SUCCESSFUL ] com.carrotsearch#hppc;0.4.1!hppc.jar (4001ms)
>  [SUCCESSFUL ] commons-io#commons-io;2.4!commons-io.jar (1444ms)
> Hello World with @Grab annotations
> 
> I cannot help but believe that there is something interfering with your network
> connections. Packet drops? Firewall with virus filter? I have no idea what.
> 
> -- Richard
> 
> On 13.12.2013, at 19:14, "Masanz, James J." <Masanz.James@mayo.edu> wrote:
> 
>> My experience this week with groovy and grapes has been one of frustration.
>> 
>> Having an issue with  download failed: org.springframework#spring-asm;3.1.0.RELEASE!spring-asm.jar
>> 
>> So I pared things down to a simple script of four lines:
>> 
>> #!/usr/bin/env groovy
>> @Grab(group='org.cleartk', module='cleartk-util', version='0.9.2')
>> import java.io.File;
>> System.out.println("Hello World with @Grab annotations");
>> 
>> And those four lines still result in the following:
>> 
>> Resolving dependency: org.cleartk#cleartk-util;0.9.2 {default=[default]}
>> Preparing to download artifact org.cleartk#cleartk-util;0.9.2!cleartk-util.jar
>> Preparing to download artifact org.apache.uima#uimaj-core;2.4.0!uimaj-core.jar
>> Preparing to download artifact org.uimafit#uimafit;1.4.0!uimafit.jar
>> Preparing to download artifact args4j#args4j;2.0.16!args4j.jar
>> Preparing to download artifact com.google.guava#guava;13.0!guava.jar
>> Preparing to download artifact com.carrotsearch#hppc;0.4.1!hppc.jar
>> Preparing to download artifact commons-io#commons-io;2.4!commons-io.jar
>> Preparing to download artifact commons-lang#commons-lang;2.4!commons-lang.jar
>> Preparing to download artifact org.apache.uima#uimaj-tools;2.4.0!uimaj-tools.jar
>> Preparing to download artifact org.springframework#spring-core;3.1.0.RELEASE!spring-core.jar
>> Preparing to download artifact org.springframework#spring-context;3.1.0.RELEASE!spring-context.jar
>> Preparing to download artifact org.apache.uima#uimaj-cpe;2.4.0!uimaj-cpe.jar
>> Preparing to download artifact org.apache.uima#uimaj-document-annotation;2.4.0!uimaj-document-annotation.jar
>> Preparing to download artifact org.apache.uima#uimaj-adapter-vinci;2.4.0!uimaj-adapter-vinci.jar
>> Preparing to download artifact org.apache.uima#jVinci;2.4.0!jVinci.jar
>> Preparing to download artifact org.springframework#spring-asm;3.1.0.RELEASE!spring-asm.jar
>> Preparing to download artifact commons-logging#commons-logging;1.1.1!commons-logging.jar
>> Preparing to download artifact org.springframework#spring-aop;3.1.0.RELEASE!spring-aop.jar
>> Preparing to download artifact org.springframework#spring-beans;3.1.0.RELEASE!spring-beans.jar
>> Preparing to download artifact org.springframework#spring-expression;3.1.0.RELEASE!spring-expression.jar
>> Preparing to download artifact aopalliance#aopalliance;1.0!aopalliance.jar
>> org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed:
>> General error during conversion: Error grabbing Grapes -- [download failed: org.springframework#spring-asm;3.1.0.RELEASE!spring-asm.jar]
>> 
>> java.lang.RuntimeException: Error grabbing Grapes -- [download failed: org.springframework#spring-asm;3.1.0.RELEASE!spring-asm.jar]
>> 
>> 
>> I tried deleting .groovy/grapes/org.springframework  but get the same error
>> I don't see this as being friendly for new users if downloading dependencies is not
>> so simple.
>> 
>> -----Original Message-----
>> From: dev-return-2317-Masanz.James=mayo.edu@ctakes.apache.org [mailto:dev-return-2317-Masanz.James=mayo.edu@ctakes.apache.org]
>> On Behalf Of Richard Eckart de Castilho
>> Sent: Friday, December 13, 2013 12:16 PM
>> To: dev@ctakes.apache.org
>> Subject: Re: scala and groovy
>> 
>> On 13.12.2013, at 15:27, Steven Bethard <steven.bethard@gmail.com> wrote:
>> 
>>> P.S. I've stayed out of this whole Groovy thing because we (at
>>> ClearTK) had some bad experiences with Groovy in the past. Mainly with
>>> Groovy scripts getting out of sync with the rest of the code base,
>>> just like XML descriptors, though perhaps the IDEs and Maven are
>>> better now and that's no longer a problem? But this whole "grape"
>>> thing instead of standard Maven isn't changing my mind. Not that I
>>> planned to switch away from Scala for my scripting anyway, but...
>> 
>> 
>> I heard and read about your bad experiences with Groovy. I believe
>> that the IDEs got somewhat better at handling Groovy. However, I think a
>> difference needs to be made depending on the use case.
>> 
>> Some people use the XML files as a format to exchange pipelines
>> with each other. However, alone, these files are not of much use.
>> One benefit of using Groovy as a pipeline-exchange format is that
>> it can actually get all its dependencies itself via Grape. The
>> Groovy script is quite self-contained (although it relies on the
>> Maven infrastructure for downloading its dependencies).
>> Another is that, thanks to uimaFIT, the Groovy code is much less
>> verbose than the XML descriptors.
>> 
>> At the UKP Lab, we also use Groovy sometimes for high-level experiment
>> logic. For us, it is a good compromise between inflexible and
>> verbose XML files and flexible and verbose Java code. Groovy is flexible
>> and concise and the IDE support is meanwhile reasonable.
>> 
>> Mind that the IDE support for Grapes (at least in Eclipse) is laughable.
>> Grapes cause the IDE to become quite unresponsive as the artifact resolution
>> is not well integrated into the IDE.
>> 
>> So here is my summarized opinion on when to use or not to use Groovy:
>> 
>> == Examples / Exchange ==
>> 
>> In order to get quick results for new users and to showcase the capabilities
>> of a component collection such as DKPro Core or cTAKES, I think the Groovy scripts
>> are a convenient vehicle. At DKPro Core, we also packaged all the resources (models)
>> as Maven artifacts, which gives us an additional edge over the manual downloading
>> currently happening in the cTAKES Groovy prototypes.
>> 
>> == High-level experiment orchestration ==
>> 
>> Groovy can be useful for high-level experiment coordination. We mostly use it
>> to conveniently set up parameter spaces and high-level tasks in DKPro Lab [1]
>> and DKPro Text Classification [2] to do parameter sweeping experiments. In
>> particular, the closures and the shorthand for setting up maps, lists, etc.
>> are helpful here.
>> 
>> == Reusable code and components ==
>> 
>> I would not recommend Groovy for lower-level code, e.g. for writing framework-level
>> code such as reusable analysis engines or library code. Mind, the IDE support got
>> better, but it is not perfect. At the lower levels, one definitely wants to have
>> strict type checking and a picky compiler.
>> 
>> Cheers,
>> 
>> -- Richard
>> 
>> [1] https://code.google.com/p/dkpro-lab/
>> [2] http://code.google.com/p/dkpro-tc/
> 

