incubator-crunch-dev mailing list archives

From Victor Iacoban <victor.iaco...@gmail.com>
Subject Re: extending crunch
Date Thu, 15 Nov 2012 13:04:00 GMT
Thanks Josh, I'll give this a try.


On Wed, Nov 14, 2012 at 9:54 PM, Josh Wills <josh.wills@gmail.com> wrote:

> I'm always glad to help people to extend Crunch in ways that are useful for
> them. I think that most things that involve type-related extensions can be
> handled using the PTypes.derived() function, which can be used to create
> custom PTypes that are mapped to underlying serialized types, so that you
> could do something like
>
> // Forgive my syntax errors, I'm doing this w/o an IDE
> PType<Object> objectType = PTypes.derived(
>     Object.class,
>     new InputMapFn<BytesWritable, Object>(),
>     new OutputMapFn<Object, BytesWritable>(),
>     Writables.writables(BytesWritable.class));
>
> ...which is essentially how Scrunch works: the PTypes { } functionality in
> Scrunch maps from Scala types to Java types using the derived
> functionality.
>
> The Converter stuff is internal to Avro and Writable; I can't think of a
> case where that would need to be exposed outside the package (i.e., once
> you've decided on whether to use Writables or Avro as your serialization
> framework, the choice of Converter is fixed.)
>
> If you have a use case where the derived type can't handle the conversion
> or is a poor choice for whatever reason, I'm all about having a discussion
> and trying out different designs.
>
> Josh
>
>
> On Wed, Nov 14, 2012 at 6:18 PM, Victor Iacoban <victor.iacoban@gmail.com> wrote:
>
> > Hi,
> >
> > I'm very interested in writing a wrapper library around Apache Crunch for
> > Clojure, something similar to the existing Scrunch.
> > How would you recommend getting started?
> >
> > I was looking through the Crunch code, and it looks like I could integrate
> > it with Clojure fairly easily by adding a custom WritableType,
> > something like WritableType<Object, BytesWritable> with a custom converter
> > or inputFn/outputFn functions.
> >
> > Unfortunately, there are several issues with this approach, so instead I'd
> > have to duplicate all of those type classes for a new type set:
> > * WritableType has a package-visible constructor, so I can neither extend
> >   nor instantiate it
> > * The Converter is instantiated inside the WritableType constructor, so if
> >   I need a different converter I'm stuck
> > * Writables has a factory method for WritableType, but it's private
> > * There seems to be an attempt to support additional WritableTypes through
> >   EXTENSIONS in Writables, but it would only work for cases where both T
> >   and W in WritableType<T, W> are Hadoop Writables
> >
> > So what do you think is the best solution? Is it possible to open up the
> > API to support custom WritableTypes, or is my only option to implement a
> > new ClojurePType and all related classes?
> >
> > Hope I'm not too detailed, but at this stage you all are probably very
> > familiar with the code
> >
> > Thanks,
> > Victor
> >
>
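
The snippet in Josh's reply passes two map functions to PTypes.derived() but leaves their bodies undefined. As a rough sketch of what such a pair might do, the class below round-trips an arbitrary object through a byte array using plain Java serialization. It assumes the objects being passed around are java.io.Serializable, and it deliberately uses no Crunch or Hadoop classes; the name ObjectBytesCodec and both method names are hypothetical and do not come from the thread or from Crunch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration only; not part of Crunch or Hadoop.
final class ObjectBytesCodec {
    private ObjectBytesCodec() {}

    // Object -> bytes: the work an output map function would need to do
    // before wrapping the result in the serialized type.
    static byte[] toBytes(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            // Also covers NotSerializableException for unsupported objects.
            throw new UncheckedIOException(e);
        }
        // The stream is closed (and flushed) by now, so the buffer is complete.
        return buffer.toByteArray();
    }

    // bytes -> Object: the work an input map function would need to do
    // after unwrapping the serialized type.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(
                "class of the serialized object is not on the classpath", e);
        }
    }
}
```

In a real job these two methods would presumably be called from the map() methods of two MapFn subclasses, one taking a BytesWritable and returning an Object and the other doing the reverse. When reading from a BytesWritable, only the first getLength() bytes of getBytes() are valid, so the array would need trimming before it is handed to fromBytes().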
