crunch-dev mailing list archives

From Micah Whitacre <mkw...@gmail.com>
Subject Re: Process of CombineFn<S,T> returns <S,U>?
Date Thu, 17 Oct 2013 20:48:01 GMT
Chandan,
   I think what you want is a simple MapFn rather than a CombineFn. The
doc for CombineFn[1] sounds like what you want, since it says "A special
DoFn implementation that converts an Iterable of values into a single
value", but it expects the result to be of the same type as the input
values. Since you want to combine the values into a different form, it
should be fairly trivial to write a MapFn that converts the
Iterable<T> -> U.
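
To make the Iterable<T> -> U idea concrete, here is a minimal sketch, not
tested against a real pipeline. The nested MapFn class is only a stand-in
for org.apache.crunch.MapFn<S, T> (which exposes an abstract map(S) method)
so that the example compiles without Crunch on the classpath. AverageFn and
the class name IterableToValueSketch are hypothetical names chosen for
illustration.

```java
import java.util.Arrays;
import java.util.List;

public class IterableToValueSketch {

    // Stand-in for org.apache.crunch.MapFn<S, T>; an assumption made so the
    // sketch is self-contained. In a real job, extend the Crunch class.
    abstract static class MapFn<S, T> {
        public abstract T map(S input);
    }

    // Hypothetical function: collapses the grouped values (Iterable<Integer>)
    // into a value of a different type (Double), here their mean.
    static class AverageFn extends MapFn<Iterable<Integer>, Double> {
        @Override
        public Double map(Iterable<Integer> values) {
            long sum = 0;
            long count = 0;
            for (Integer v : values) {
                sum += v;
                count++;
            }
            // An empty group yields 0.0 rather than dividing by zero.
            return count == 0 ? 0.0 : (double) sum / count;
        }
    }

    public static void main(String[] args) {
        List<Integer> grouped = Arrays.asList(1, 2, 3);
        System.out.println(new AverageFn().map(grouped)); // prints 2.0
    }
}
```

In an actual pipeline the function would extend the real MapFn and be
applied to the values of the grouped table; the exact wiring depends on the
Crunch version. Note that a MapFn over grouped values runs after the
shuffle, so unlike a CombineFn it gives no map-side pre-aggregation.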

[1] -
http://crunch.apache.org/apidocs/0.7.0/org/apache/crunch/CombineFn.html


On Thu, Oct 17, 2013 at 3:30 PM, Chandan Biswas <cbiswas1983@gmail.com> wrote:

> I was trying to refactor some code and wanted to use CombineFn.
> But when I dug deeper, I found that I can't, because Crunch doesn't
> provide the functionality I need. For example, I have a
> PGroupedTable<S,T>. I wanted to apply a CombineFn<S,T> to it and get a
> PCollection<S,U> instead of T. Right now, CombineFn only allows the same
> type as its return value. The use case is that there would be some
> time saved in sorting. It's natural that aggregating objects on the
> map side can create an object of a new, different type.
>
> Any thoughts on this? Am I missing anything? If this can be written
> differently using existing functionality, please let me know.
>
> Thanks
> Chandan
>
