crunch-user mailing list archives

From Josh Wills <jwi...@cloudera.com>
Subject Re: Scrunch example project with SBT?
Date Fri, 20 Jun 2014 20:34:32 GMT
You need to manually call run() or done() to execute the pipeline if you're
not materializing the output. The user guide will be useful for the basic
concepts, even though it focuses on the Java API.
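
To illustrate why write() on its own produces no output, here is a minimal, self-contained sketch of deferred execution. ToyPipeline and ToyCollection are hypothetical names made up for this illustration; they are not Scrunch classes and only model the behaviour described above: write() registers a target, run() executes the registered work, and materialize forces evaluation right away.

```scala
import scala.collection.mutable

// Toy model only: hypothetical classes, NOT the Scrunch API.
final class ToyPipeline {
  // Work registered by write() that has not been executed yet.
  private val pending = mutable.ListBuffer.empty[() => Unit]
  // Stand-in for the file system: target name -> lines written.
  val outputs = mutable.Map.empty[String, Seq[String]]

  def read(lines: Seq[String]): ToyCollection =
    new ToyCollection(this, () => lines)

  def register(job: () => Unit): Unit = pending += job

  // Executes everything registered so far, then clears the queue.
  def run(): Unit = {
    pending.foreach(job => job())
    pending.clear()
  }
}

final class ToyCollection(pipeline: ToyPipeline, compute: () => Seq[String]) {
  // Transformations are lazy: they only wrap the computation.
  def map(f: String => String): ToyCollection =
    new ToyCollection(pipeline, () => compute().map(f))

  // Registers the output target; nothing is computed or written here.
  def write(target: String): Unit =
    pipeline.register(() => pipeline.outputs.update(target, compute()))

  // Forces evaluation immediately and returns the data to the caller.
  def materialize: Seq[String] = compute()
}

object Demo {
  def main(args: Array[String]): Unit = {
    val p = new ToyPipeline
    p.read(Seq("a", "b")).map(_.toUpperCase).write("wordcounts")
    println(p.outputs.contains("wordcounts")) // false: write() only registered the target
    p.run()
    println(p.outputs("wordcounts"))          // the data exists only after run()
  }
}
```

For the snippet quoted below, the equivalent step would presumably be a call to run() or done() on the pipeline after the write(...) line.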
On Jun 20, 2014 1:27 PM, "Daniel Siegmann" <daniel.siegmann@velos.io> wrote:

> Thanks Josh! The thrift and protobuf defs were what I was missing. I'm
> able to compile and run the code now. I also updated to Scrunch 0.10.0.
>
> Any idea why it might not write the output? If I have
>
> countWords(args(0)).materialize.foreach(line => println(s"**** $line"))
>
> I get all my output, but
>
> countWords(args(0)).write(to.textFile(args(1)))
>
> doesn't even create the output directory, even though I see this in my logs:
>
> 14/06/20 16:17:47 INFO impl.FileTargetImpl: Will write output files to new path: /var/folders/th/7vf9rjqd1955jnwnzg3x9ym40000gn/T/1403295466563-1/wordcounts
>
> No exceptions or anything. I'm probably missing something obvious. :-(
>
>
> On Thu, Jun 19, 2014 at 6:03 PM, Josh Wills <jwills@cloudera.com> wrote:
>
>> Here you go: https://github.com/jwills/scrunch-demo
>>
>> Did this w/Maven; you'll have to forgive me as my SBT-fu isn't great. It
>> looks like vanilla Hadoop 1.x doesn't include any thrift/protobuf
>> dependencies that Scrunch expects to be present at compile-time; I added
>> them as provided dependencies in this example and then verified that I
>> could run the -job.jar that I built w/mvn package under Hadoop 1.0.3.
>>
>> J
>>
>>
>> On Thu, Jun 19, 2014 at 2:33 PM, Daniel Siegmann <
>> daniel.siegmann@velos.io> wrote:
>>
>>> Hi Josh, thanks for the reply.
>>>
>>>> Which version of Hadoop are you looking to compile against?
>>>
>>> I think any 1.x version will suffice (our production cluster is MapR).
>>>
>>> The Spotify comparison is interesting. Too bad they didn't evaluate
>>> Scoobi as well. Thanks for the info.
>>>
>>
>>
>>
>> --
>> Director of Data Science
>> Cloudera <http://www.cloudera.com>
>> Twitter: @josh_wills <http://twitter.com/josh_wills>
>>
>
>
>
> --
> Daniel Siegmann, Software Developer
> Velos
> Accelerating Machine Learning
>
> 440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001
> E: daniel.siegmann@velos.io W: www.velos.io
>
