From: Josh Wills
Date: Tue, 1 Oct 2013 13:10:08 -0700
Subject: Re: Crunch on EMR
To: user@crunch.apache.org

Hey Som,

You should be able to use any of the non-hadoop2 jars for Crunch on EMR,
like the regular 0.7.0:

http://mvnrepository.com/artifact/org.apache.crunch/crunch-core/0.7.0

Those are compiled against the MR1 APIs, which is why you're getting the
TaskInputOutputContext exception (the API changed from MR1 to MR2, which
CDH4.3.0 and hadoop2 use.)

Josh

On Tue, Oct 1, 2013 at 12:00 PM, Som Satpathy wrote:

> Hi All,
>
> I have been trying to run crunch jobs on amazon EMR and faced a problem
> while job execution -
>
> "found class org.apache.hadoop.mapreduce.taskinputoutputcontext but
> interface was expected"
>
> This is happening because of hadoop incompatibilities between APIs used
> while implementing the hadoop job, and the hadoop-code that runs in the
> cluster.
>
> My crunch fat jar is based on crunch version 0.7 (CDH 4.3.0) while EMR
> runs hadoop 1.0.3 (where TaskInputOutputContext is implemented as an
> abstract class)
>
> Has any one been able to successfully execute their crunch jobs on EMR?
>
> If yes, what are the best practices to make custom crunch fat jars work on
> EMR?
>
> Look forward to hearing your thoughts.
>
> Thanks,
>
> Som

--
Director of Data Science
Cloudera
Twitter: @josh_wills
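
A note on checking the mismatch described in this thread: the error says the job code expected TaskInputOutputContext to be an interface while the cluster provides a class. One way to see which shape a given classpath actually provides is a small reflection check. This is only a minimal sketch and not something from the thread itself; the class name ApiShapeCheck and the method describe are hypothetical names chosen for illustration, and it uses nothing beyond the standard java.lang.Class API.

```java
public class ApiShapeCheck {

    // Hypothetical helper: reports whether the named type is visible on the
    // current classpath as an interface or as a class. The type is loaded
    // without running its static initializers.
    static String describe(String className) {
        try {
            Class<?> type = Class.forName(
                    className, false, ApiShapeCheck.class.getClassLoader());
            return type.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException | LinkageError e) {
            // The type (or something it depends on) is not on the classpath.
            return "not found";
        }
    }

    public static void main(String[] args) {
        // Defaults to the type named in the error message from this thread.
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.mapreduce.TaskInputOutputContext";
        System.out.println(name + " -> " + describe(name));
    }
}
```

Run with the cluster's Hadoop jars on the classpath, this should report "class" on the Hadoop 1.0.3 setup Som describes (where the type is an abstract class). A jar built against an API where the type is an interface would then fail at runtime with the "found class ... but interface was expected" error quoted above. Without any Hadoop jars on the classpath it reports "not found".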