incubator-crunch-dev mailing list archives

From Josh Wills <jwi...@cloudera.com>
Subject Re: crackle status
Date Wed, 05 Dec 2012 07:11:41 GMT
Hey Victor--

Silly question: is there a specific version of gradle I should use? I
downloaded 1.3 and ran gradle build inside of crackle-core, and I got an
exception on the compileClojure step:

Caused by: java.lang.NoSuchMethodError:
org.gradle.util.GUtil.join(Ljava/util/Collection;Ljava/lang/String;)Ljava/lang/String;

Sorry for the sporadic feedback; I'm trying to play with this whenever I
can find a free moment.

Thanks!
Josh


On Sun, Dec 2, 2012 at 5:33 AM, Victor Iacoban <victor.iacoban@gmail.com> wrote:

> Hi,
>
> I've changed the way crackle users define remote functions, which allows
> for a nicer and more fluent DSL.
> I still haven't gotten rid of the influence of Crunch types on the crackle
> DSL, but at least now it's harder to get it wrong and easier to debug.
>
> I think this is pretty close to what I'm trying to achieve. Next steps are
> going to be in the direction of documentation, more validation, error
> condition checks, tests and crackle hbase.
>
> here it is, any comments are welcome:
>
> (ns crackle.example
>   (:require [crackle.from :as from])
>   (:require [crackle.to :as to])
>   (:use crackle.core))
> ;====== word count example ===============
> (fn-mapcat split-words [line] :strings
>   (clojure.string/split line #"\s+"))
> (defn count-words [input-path output-path]
>   (pipeline (from/text-file input-path)
>     (split-words)
>     (count-values)
>     (to/text-file output-path)))
> ;====== average bytes by ip example ======
> (fn-map parse-line [line] [:strings :clojure]
>   (let [parts (clojure.string/split line #"\s+")]
>     (pair-of (first parts) [(read-string (second parts)) 1])))
> (fn-combine sum-bytes-and-counts [value1 value2]
>   [(+ (first value1) (first value2)) (+ (second value1) (second value2))])
> (fn-mapv compute-average [value] :ints
>   (int (apply / value)))
> (defn count-bytes-by-ip [input-path output-path]
>   (pipeline (from/text-file input-path)
>     (parse-line)
>     (group-by-key)
>     (sum-bytes-and-counts)
>     (compute-average)
>     (to/text-file output-path)))
>
> -- victor
>
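
For anyone following the thread without a working build, here is a rough in-memory sketch of what the quoted "average bytes by ip" pipeline appears to compute. It is plain Clojure and does not use crackle or Crunch at all. The function names mirror the quoted example, but average-bytes-by-ip and the "<ip> <bytes>" input format are assumptions inferred from parse-line, not anything confirmed in the thread.

```clojure
(require '[clojure.string :as str])

;; Assumed input format: "<ip> <bytes>", separated by whitespace.
(defn parse-line
  "Turns one input line into [ip [bytes 1]]."
  [line]
  (let [[ip bytes] (str/split line #"\s+")]
    [ip [(Long/parseLong bytes) 1]]))

(defn sum-bytes-and-counts
  "Combines two [total-bytes count] pairs element-wise."
  [[bytes1 count1] [bytes2 count2]]
  [(+ bytes1 bytes2) (+ count1 count2)])

(defn compute-average
  "Truncating average of a [total-bytes count] pair."
  [[total n]]
  (int (/ total n)))

;; Hypothetical stand-in for the pipeline: group by ip, combine, then average.
(defn average-bytes-by-ip
  [lines]
  (->> lines
       (map parse-line)
       (reduce (fn [acc [ip v]]
                 (update acc ip #(if % (sum-bytes-and-counts % v) v)))
               {})
       (reduce-kv (fn [acc ip v] (assoc acc ip (compute-average v))) {})))
```

Calling (average-bytes-by-ip ["10.0.0.1 100" "10.0.0.2 50" "10.0.0.1 300"]) returns a map from each ip to its average, here 200 for 10.0.0.1 and 50 for 10.0.0.2. Carrying a [total count] pair through the combine step, as the quoted example does, keeps the combiner associative, which a direct running average would not be.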



-- 
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>
