crunch-dev mailing list archives

From Matthias Friedrich <m...@mafr.de>
Subject Re: refactoring crunch-archetype
Date Tue, 12 Mar 2013 08:33:30 GMT
Hi,

sure, feel free to take this on. The tricky thing is to make sure that
the generated project has correct dependencies for both Hadoop 1 and 2.

Last time I tried this (and failed due to bugs in the archetype plugin),
I used Velocity templates and introduced a new archetype variable so
that the user could select whether they are creating a Hadoop 1 or 2
project. Maybe you'll get it working; there has since been a new release
of the archetype plugin.

Shout if you need any help.

Regards,
  Matthias

On Monday, 2013-03-11, Josh Wills wrote:
> I cc'd everyone else on here, but since this was your module, I thought it
> best to solicit your opinion before refactoring it.
> 
> We never managed to get crunch-archetypes working w/hadoop 2.x, which is
> apparently deprecating the lib/* trick for including client dependencies in
> favor of the -libjars option (see
> http://blog.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/ and
> http://architects.dzone.com/articles/using-libjars-option-hadoop )
> 
> The way that I have found to do this in Maven is to use the
> copy-dependencies option of the maven-dependency-plugin and include a shell
> script in a bin/ directory that knows how to set up the HADOOP_CLASSPATH and
> libjars arguments for use with hadoop jar. Although this approach is more
> complex than the lib/* trick, it will be able to support hadoop 1.x as well
> as hadoop 2.x.
> 
> Do you have any objections to me taking this on, and/or any other landmines
> I should keep an eye out for?
> 
> Thanks!
> Josh
> 
> -- 
> Director of Data Science
> Cloudera <http://www.cloudera.com>
> Twitter: @josh_wills <http://twitter.com/josh_wills>
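
For illustration, the launcher idea described in the quoted message can be
sketched roughly as follows. This is not the script from the thread (none was
posted); it is a minimal Python sketch under stated assumptions: the
dependency jars have already been copied into one directory (for example
target/dependency, the default output of copy-dependencies), and the function
name, job jar name and main class are hypothetical. It builds a
colon-separated HADOOP_CLASSPATH for the client side and a comma-separated
-libjars list for the job, then assembles a "hadoop jar" command line.

```python
import os
from pathlib import Path


def build_hadoop_invocation(job_jar, main_class, lib_dir, app_args=(), base_env=None):
    """Return (env, argv) for running job_jar with every jar found in lib_dir.

    Hypothetical helper for illustration; all names are assumptions.
    """
    # Sort the jars so the resulting command line is reproducible.
    jars = sorted(str(p) for p in Path(lib_dir).glob("*.jar"))

    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("HADOOP_CLASSPATH")
    entries = ([existing] if existing else []) + jars
    if entries:
        # Client-side classpath: entries are colon-separated on Unix.
        env["HADOOP_CLASSPATH"] = ":".join(entries)

    argv = ["hadoop", "jar", str(job_jar), main_class]
    if jars:
        # -libjars is a generic option and takes a comma-separated list;
        # it goes after the main class and before the application arguments.
        argv += ["-libjars", ",".join(jars)]
    argv += list(app_args)
    return env, argv


if __name__ == "__main__":
    # Assumed paths and class name, for demonstration only.
    env, argv = build_hadoop_invocation(
        "target/my-crunch-job.jar", "com.example.WordCount",
        "target/dependency", ["input", "output"])
    print("HADOOP_CLASSPATH=" + env.get("HADOOP_CLASSPATH", ""))
    print(" ".join(argv))
```

Note that -libjars is only honoured when the main class parses Hadoop's
generic options, for example by running through ToolRunner. The returned
pair can be passed to subprocess.run(argv, env=env) to actually launch the
job.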
