crunch-user mailing list archives

From Mike Barretta <mike.barre...@gmail.com>
Subject Re: ClassNotFoundException: Class org.apache.crunch.impl.mr.run.CrunchMapper
Date Tue, 02 Dec 2014 22:07:13 GMT
FWIW, I solved this by manually adding all necessary jars into the
DistributedCache...ugly, but effective!
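
For anyone landing here from a search, a minimal sketch of the first half of that kind of workaround, assuming the dependency jars sit together in one local directory. The class name `JarListBuilder` and the default `lib` path are hypothetical and not taken from the job discussed in this thread. The sketch only gathers the jar paths into the comma-separated form that Hadoop's `-libjars` generic option accepts; registering them with the DistributedCache needs Hadoop on the classpath and is not shown.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class JarListBuilder {

    // Returns the absolute paths of all *.jar files directly under libDir,
    // sorted so the result is stable between runs.
    static List<String> findJars(Path libDir) throws IOException {
        try (Stream<Path> entries = Files.list(libDir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".jar"))
                    .map(p -> p.toAbsolutePath().toString())
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    // Joins jar paths with commas, the separator used by -libjars.
    static String toLibJarsValue(List<String> jars) {
        return String.join(",", jars);
    }

    public static void main(String[] args) throws IOException {
        // "lib" is an assumed default; pass the real directory as the first argument.
        Path libDir = Paths.get(args.length > 0 ? args[0] : "lib");
        System.out.println(toLibJarsValue(findJars(libDir)));
    }
}
```

The printed value could then be handed to a driver that implements Tool and is started through ToolRunner, which is what parses `-libjars`. Whether that reaches the jobs spawned by the pipeline in an Oozie setup is an assumption that would need checking on the cluster.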

On Wed, Nov 26, 2014 at 12:29 PM, Mike Barretta <mike.barretta@gmail.com>
wrote:

> Thank you for the quick reply.
>
> I am indeed using the Oozie workflow lib directory as described here:
> http://oozie.apache.org/docs/3.3.2/WorkflowFunctionalSpec.html#a7_Workflow_Application_Deployment.
>
>
> The primary job, which implements Tool, is able to run; it's just the jobs
> launched by the doFn() that fail.  Is there a step where I might need to
> tell the Crunch pipeline about the jars loaded by Oozie?
>
> On Fri, Nov 21, 2014 at 5:27 PM, Micah Whitacre <mkwhitacre@gmail.com>
> wrote:
>
>> Support for a lib folder inside of a jar is not necessarily guaranteed
>> on all versions of Hadoop.[1]
>>
>> We typically go with the "uber" jar, where we use maven-shade-plugin to
>> explode the Crunch dependencies and others into the assembly jar.
>> Another approach, since you are using Oozie, is to include the jar in the
>> workflow lib directory, which should put the jar on the classpath.  The
>> last approach is to use DistributedCache manually yourself, which will
>> distribute the jar out to the cluster.
>>
>> [1] -
>> http://blog.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/
>>
>> On Fri, Nov 21, 2014 at 4:15 PM, Mike Barretta <mike.barretta@gmail.com>
>> wrote:
>>
>>> All,
>>>
>>> I'm running an MRPipeline from crunch-core 0.11.0-hadoop2 on a CDH5.1
>>> cluster via Oozie.  While the main job runs okay, the doFn() it calls fails
>>> with the ClassNotFoundException.  The jar containing my classes does indeed contain
>>> lib/crunch-core-0.11.0-hadoop2.jar.
>>>
>>> Does the crunch jar need to be added to the hadoop lib on all nodes?  It
>>> seems like that would/should be unnecessary.
>>>
>>> Thanks,
>>> Mike
>>>
>>
>>
>
