Mailing-List: contact user-help@crunch.apache.org; run by ezmlm
Reply-To: user@crunch.apache.org
Date: Fri, 20 Jun 2014 16:27:04 -0400
Subject: Re: Scrunch example project with SBT?
From: Daniel Siegmann
To: user@crunch.apache.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=f46d04388dfb102e4f04fc4a51b4

--f46d04388dfb102e4f04fc4a51b4
Content-Type: text/plain; charset=UTF-8

Thanks Josh! The thrift and protobuf defs were what I was missing. I'm able
to compile and run the code now. I also updated to Scrunch 0.10.0.

Any idea why it might not write the output? If I have

    countWords(args(0)).materialize.foreach(line => println(s"**** $line"))

I get all my output, but

    countWords(args(0)).write(to.textFile(args(1)))

Doesn't even create the output directory, even though I see this in my logs

    14/06/20 16:17:47 INFO impl.FileTargetImpl: Will write output files to new path: /var/folders/th/7vf9rjqd1955jnwnzg3x9ym40000gn/T/1403295466563-1/wordcounts

No exceptions or anything. I'm probably missing something obvious. :-(

On Thu, Jun 19, 2014 at 6:03 PM, Josh Wills wrote:

> Here you go: https://github.com/jwills/scrunch-demo
>
> Did this w/Maven; you'll have to forgive me as my SBT-fu isn't great. It
> looks like vanilla Hadoop 1.x doesn't include any thrift/protobuf
> dependencies that Scrunch expects to be present at compile-time; I added
> them as provided dependencies in this example and then verified that I
> could run the -job.jar that I built w/mvn package under Hadoop 1.0.3.
>
> J
>
> On Thu, Jun 19, 2014 at 2:33 PM, Daniel Siegmann wrote:
>
>> Hi Josh, thanks for the reply.
>>
>>> Which version of Hadoop are you looking to compile against?
>>
>> I think any 1.x version will suffice (our production cluster is MapR).
>>
>> The Spotify comparison is interesting. Too bad they didn't evaluate
>> Scoobi as well. Thanks for the info.
>
> --
> Director of Data Science
> Cloudera
> Twitter: @josh_wills

--
Daniel Siegmann, Software Developer
Velos
Accelerating Machine Learning

440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001
E: daniel.siegmann@velos.io W: www.velos.io

--f46d04388dfb102e4f04fc4a51b4
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

--f46d04388dfb102e4f04fc4a51b4--