crunch-dev mailing list archives

From Josh Wills <jwi...@cloudera.com>
Subject refactoring crunch-archetype
Date Mon, 11 Mar 2013 21:01:14 GMT
Hey Matthias,

I cc'd everyone else on here, but since this was your module, I thought it
best to solicit your opinion before refactoring it.

We never managed to get crunch-archetypes working w/hadoop 2.x, which is
apparently deprecating the lib/* trick for including client dependencies in
favor of the -libjars option (see
http://blog.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/ and
http://architects.dzone.com/articles/using-libjars-option-hadoop ).

The way that I have found to do this in Maven is to use the
copy-dependencies option of the maven-dependency-plugin and include a shell
script in a bin/ directory that knows how to set up the HADOOP_CLASSPATH and
libjars arguments for use with hadoop jar. Although this approach is more
complex than the lib/* trick, it will be able to support hadoop 1.x as well
as hadoop 2.x.
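
To make the idea concrete, here is a rough sketch of what such a launcher would need to compute, written in Python rather than shell for readability. It assumes copy-dependencies has already placed the dependency jars in a single directory; the function names, the lib directory, and the jar and class names are all hypothetical. It also assumes the job's main class runs through ToolRunner, since -libjars is only honored when the arguments pass through Hadoop's generic options parsing.

```python
import os
from pathlib import Path


def collect_jars(lib_dir):
    """Return the sorted paths of all .jar files directly inside lib_dir."""
    return sorted(str(p) for p in Path(lib_dir).glob("*.jar"))


def build_hadoop_command(app_jar, main_class, lib_dir, app_args=()):
    """Build (env, argv) for a `hadoop jar` launch with client dependencies.

    Hypothetical helper: lib_dir is assumed to be the directory that
    maven-dependency-plugin's copy-dependencies goal wrote the jars into.
    """
    jars = collect_jars(lib_dir)
    env = dict(os.environ)

    # HADOOP_CLASSPATH exposes the jars to the client-side JVM.
    # Entries are joined with the platform path separator (":" on Unix).
    existing = env.get("HADOOP_CLASSPATH")
    entries = jars + ([existing] if existing else [])
    if entries:
        env["HADOOP_CLASSPATH"] = os.pathsep.join(entries)

    # -libjars ships the same jars to the tasks. It takes a comma-separated
    # list and must come before the application's own arguments.
    argv = ["hadoop", "jar", app_jar, main_class]
    if jars:
        argv += ["-libjars", ",".join(jars)]
    argv += list(app_args)
    return env, argv
```

The returned pair could then be handed to something like subprocess.run(argv, env=env). Nothing in the sketch depends on the Hadoop version, which is the point of the approach: the same HADOOP_CLASSPATH and -libjars values should work for hadoop 1.x and 2.x.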

Do you have any objections to me taking this on, and/or any other landmines
I should keep an eye out for?

Thanks!
Josh

-- 
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>
