From: Victor Iacoban
To: crunch-dev@incubator.apache.org
Subject: Re: extending crunch
Date: Thu, 15 Nov 2012 08:04:00 -0500
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=047d7b678494dcd43104ce88480b

--047d7b678494dcd43104ce88480b
Content-Type: text/plain; charset=UTF-8

Thanks Josh, will give this a try.

On Wed, Nov 14, 2012 at 9:54 PM, Josh Wills wrote:

> I'm always glad to help people extend Crunch in ways that are useful for
> them. I think that most things that involve type-related extensions can be
> handled using the PTypes.derived() function, which can be used to create
> custom PTypes that are mapped to underlying serialized types, so that you
> could do something like
>
> // Forgive my syntax errors, I'm doing this w/o an IDE
> PType<Object> objectType = PTypes.derived(Object.class,
>     new InputMapFn<BytesWritable, Object>(),
>     new OutputMapFn<Object, BytesWritable>(),
>     Writables.writables(BytesWritable.class));
>
> ...which is essentially how Scrunch works: the PTypes { } functionality in
> Scrunch maps from Scala types to Java types using the derived
> functionality.
>
> The Converter stuff is internal to Avro and Writable, I can't think of a
> case where that would need to be exposed outside the package (i.e., once
> you've decided on whether to use Writables or Avro as your serialization
> framework, the choice of Converter is fixed.)
>
> If you have a use case where the derived type can't handle the conversion
> or is a poor choice for whatever reason, I'm all about having a discussion
> and trying out different designs.
>
> Josh
>
>
> On Wed, Nov 14, 2012 at 6:18 PM, Victor Iacoban wrote:
>
> > Hi,
> >
> > I'm very interested in writing a wrapper library around Apache Crunch for
> > Clojure, something similar to the existing Scrunch.
> > How do you recommend I start?
> >
> > I was looking through the Crunch code and it looks like I could pretty
> > easily integrate it with Clojure by adding a custom WritableType.
> > Something like WritableType with a custom converter
> > or inputFn/outputFn functions.
> >
> > Regretfully there are several issues with this approach, and instead I'd
> > have to duplicate all those type classes for a new type set:
> > * WritableType has a package-visible constructor, so I cannot extend it
> >   and cannot instantiate it
> > * Converter is instantiated inside the WritableType constructor, so in
> >   case I need a different converter I'm stuck
> > * Writables has a factory method for WritableType, but it's private
> > * it looks like there is an attempt to support additional WritableTypes
> >   through EXTENSIONS in Writables, but it would only work for cases where
> >   in WritableType<T, W> both T and W are Hadoop writables
> >
> > So what do you think is the best solution: is it possible to open up the
> > API to support custom WritableTypes, or is the only option for me to
> > implement a new ClojurePType and all related classes?
> >
> > Hope I'm not too detailed, but at this stage you all are probably very
> > familiar with the code.
> >
> > Thanks,
> > Victor
> >
>

--047d7b678494dcd43104ce88480b--
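
For readers following the thread: the derived-type idea Josh describes (a user-facing type mapped onto an underlying serialized type through an input function and an output function) can be sketched in plain Java as below. This is only an illustration under stated assumptions, not Crunch's API: DerivedTypeSketch, Derived, read, write, and utf8Strings are hypothetical names, and byte[] stands in for BytesWritable.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

// Hypothetical stand-in for the idea behind a "derived" type: a user-facing
// type T layered over an already-serializable base type S via two functions.
// None of these names are Crunch classes.
final class DerivedTypeSketch {

    /** Pairs an input function (S -> T) with an output function (T -> S). */
    static final class Derived<S, T> {
        private final Function<S, T> inputFn;
        private final Function<T, S> outputFn;

        Derived(Function<S, T> inputFn, Function<T, S> outputFn) {
            this.inputFn = inputFn;
            this.outputFn = outputFn;
        }

        // Convert the stored representation into the user-facing value.
        T read(S serialized) {
            return inputFn.apply(serialized);
        }

        // Convert the user-facing value into the stored representation.
        S write(T value) {
            return outputFn.apply(value);
        }
    }

    /** Example: strings stored as UTF-8 bytes (byte[] plays the role of the base type). */
    static Derived<byte[], String> utf8Strings() {
        return new Derived<>(
            bytes -> new String(bytes, StandardCharsets.UTF_8),
            text -> text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        Derived<byte[], String> type = utf8Strings();
        byte[] stored = type.write("crunch");
        System.out.println(type.read(stored));
    }
}
```

The point of the design is that only the base type needs to know how to serialize itself; a wrapper for another language would supply just the two conversion functions rather than a whole new family of type classes.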