crunch-user mailing list archives

From Josh Wills <jwi...@cloudera.com>
Subject Re: Current state of scrunch
Date Sun, 03 Mar 2013 04:45:31 GMT
Hey John,

Not a silly question, nor a beginner one. We have a Maven profile in
Scrunch that builds all of the packaging you need to run it from the Scala
interpreter. To use it, run:

mvn clean package -P scrunch

in the crunch-scrunch directory. It will create a
crunch-scrunch-<version>-release.tar.gz file in the target/ directory
with all of the scripts and libs set up for you to run from the interpreter
(at least as of Scala 2.9.2). Note that you'll need to specify
-Dcrunch.platform=2 to build against the Hadoop 2.x
APIs (e.g., if you're using CDH4).
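
As a rough sketch of the steps above, the two variants of the build can be wrapped in a small shell helper. The function name build_scrunch and its arguments are only illustrative (they are not part of Crunch), and it assumes mvn is on your PATH and that you pass the path to the crunch-scrunch directory of a source checkout:

```shell
# Hypothetical helper, not part of Crunch: runs the scrunch packaging build.
#   $1: path to the crunch-scrunch directory of a Crunch source checkout
#   $2: optional value for -Dcrunch.platform (use 2 for the Hadoop 2.x APIs)
build_scrunch() {
  dir="$1"
  platform="$2"
  if [ -n "$platform" ]; then
    # Build against the requested platform, e.g. Hadoop 2.x / CDH4.
    (cd "$dir" && mvn clean package -P scrunch -Dcrunch.platform="$platform")
  else
    # No platform given: use the build's default.
    (cd "$dir" && mvn clean package -P scrunch)
  fi
}
```

For a CDH4 cluster you would call it as `build_scrunch ./crunch-scrunch 2` and then look for the release tarball under crunch-scrunch/target/.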

J



On Sat, Mar 2, 2013 at 3:29 PM, John Jensen <jensen@richrelevance.com> wrote:

>
>  Thanks. I'm just a little seduced by the syntactical simplicity of
> writing in scala, so I figured I'd take a look.
>
>  BTW, (silly beginner question) do you have any pointers on how to run
> scrunch from the Scala interpreter?
>
>  If I just try something like:
> ../scala-2.9.3/bin/scala -classpath `hadoop
> classpath`:0.5.0-incubating.jar:crunch-scrunch-0.5.0-incubating.jar:lib/guava-11.0.2.jar:`hadoop
> classpath`:lib/avro-1.7.0.jar:lib/avro-mapred-1.7.0.jar
>
>  scala> val pipeline = Pipeline.mapReduce
>
>  I get
>  java.io.IOException: No FileSystem for scheme: file
> at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2250)
> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2257)
>  …
>
>  I figure there must be a simpler way.
>
>
>   From: Josh Wills <jwills@cloudera.com>
> Reply-To: "user@crunch.apache.org" <user@crunch.apache.org>
> Date: Saturday, March 2, 2013 1:06 PM
> To: "user@crunch.apache.org" <user@crunch.apache.org>
> Subject: Re: Current state of scrunch
>
>   Hey John,
>
>  I think Scrunch has a good foundation right now, but yes, there is some
> work to do to expose new functionality in the Java APIs. I'd like to spend
> some more time on Scrunch for the next release, so if you come across
> something you need, please let me know and I'll add it straightaway.
>
>  J
>
>
> On Sat, Mar 2, 2013 at 12:43 PM, John Jensen <jensen@richrelevance.com> wrote:
>
>>
>>  Hey,
>>
>>  I am considering taking a closer look at migrating some of our existing
>> crunch code to Scala, and I was wondering about the current state of the
>> scrunch code.
>> Is it generally being kept in sync with the mainline crunch development?
>>
>>  I assume most functionality is being delegated to the underlying crunch
>> implementation but presumably there is still work needed as features are
>> added to crunch. No?
>>
>>  -- John
>>
>>
>
>
>  --
> Director of Data Science
> Cloudera <http://www.cloudera.com>
> Twitter: @josh_wills <http://twitter.com/josh_wills>
>



-- 
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>
