crunch-user mailing list archives

From: Mike Barretta <mike.barre...@gmail.com>
Subject: Re: ClassNotFoundException: Class org.apache.crunch.impl.mr.run.CrunchMapper
Date: Wed, 26 Nov 2014 17:29:57 GMT

Thank you for the quick reply.

I am indeed using the Oozie workflow lib directory as described here:
http://oozie.apache.org/docs/3.3.2/WorkflowFunctionalSpec.html#a7_Workflow_Application_Deployment

The primary job, which implements Tool, runs fine; it is only the jobs
launched by the doFn() that fail.  Is there a step where I need to tell
the Crunch pipeline about the jars loaded by Oozie?

On Fri, Nov 21, 2014 at 5:27 PM, Micah Whitacre <mkwhitacre@gmail.com>
wrote:

> Support for a lib folder inside of a jar is not guaranteed on all
> versions of Hadoop.[1]
>
> We typically go with an "uber" jar, where we use maven-shade-plugin to
> explode the Crunch dependencies and others into the assembly jar.
> Another approach, since you are using Oozie, is to include the jar in the
> workflow lib directory, which should put the jar on the classpath.  The
> last approach is to use DistributedCache yourself, which will distribute
> the jar out to the cluster.
>
> [1] -
> http://blog.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/
>
> On Fri, Nov 21, 2014 at 4:15 PM, Mike Barretta <mike.barretta@gmail.com>
> wrote:
>
>> All,
>>
>> I'm running an MRPipeline from crunch-core 0.11.0-hadoop2 on a CDH5.1
>> cluster via Oozie.  While the main job runs okay, the doFn() it calls
>> fails with the ClassNotFoundException in the subject line.  The jar
>> containing my classes does indeed contain
>> lib/crunch-core-0.11.0-hadoop2.jar.
>>
>> Does the crunch jar need to be added to the hadoop lib on all nodes?  It
>> seems like that would/should be unnecessary.
>>
>> Thanks,
>> Mike
>>
>
>
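
The difference between a jar nested under lib/ and classes exploded into an "uber" jar can be checked locally before deploying. The following is a minimal sketch using only the JDK's java.util.jar API; it does not touch Hadoop, Oozie, or Crunch. The class name JarClassCheck and its method names are hypothetical, chosen for illustration. It reports whether a class is stored directly in the job jar, and which jars sit under lib/ inside it.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not part of Crunch or Hadoop.
public class JarClassCheck {

    /** Converts a class name such as "a.b.C" to the jar entry name "a/b/C.class". */
    public static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** True if the class file is a direct entry of the jar (not inside a nested jar). */
    public static boolean hasTopLevelClass(Path jar, String className) throws IOException {
        try (JarFile jf = new JarFile(jar.toFile())) {
            return jf.getEntry(entryNameFor(className)) != null;
        }
    }

    /** Entry names of jar files stored under lib/ inside the given jar. */
    public static List<String> nestedLibJars(Path jar) throws IOException {
        List<String> result = new ArrayList<>();
        try (JarFile jf = new JarFile(jar.toFile())) {
            for (JarEntry entry : Collections.list(jf.entries())) {
                String name = entry.getName();
                if (name.startsWith("lib/") && name.endsWith(".jar")) {
                    result.add(name);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java JarClassCheck <job.jar>");
            return;
        }
        Path jar = Paths.get(args[0]);
        // The class named in the subject line of this thread.
        String cls = "org.apache.crunch.impl.mr.run.CrunchMapper";
        System.out.println("top-level " + cls + ": " + hasTopLevelClass(jar, cls));
        System.out.println("nested lib jars: " + nestedLibJars(jar));
    }
}
```

For a job jar laid out as described in the original question, this would be expected to report the Crunch class as absent at the top level while listing the jar under lib/. A shaded jar should report the class as present at the top level.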
