crunch-dev mailing list archives

From Gabriel Reid <gabriel.r...@gmail.com>
Subject Re: Generic class for converting PCollection to PTable
Date Thu, 20 Feb 2014 06:07:35 GMT


> On 20 Feb 2014, at 05:11, Jinal Shah <jinalshah2007@gmail.com> wrote:
> 
> I didn't know that, but I was talking more about something like this:
> PCollection<V> to PTable<K,V>, basically.
> 

I think what you want is the PCollection#by method. It takes a MapFn that maps each value
V to its key K, along with the PType of the key, and returns a PTable<K,V>.
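
To make the shape of that transformation concrete, here is a minimal plain-Java sketch
that does not use Crunch at all. The keyBy helper and the KeyBySketch class are
hypothetical names made up for illustration; java.util.function.Function stands in for
the MapFn, and a List of Map.Entry stands in for the PTable. The real call runs inside a
pipeline and also needs the key's PType, so treat this only as a model of the
"value in, (key, value) out" behaviour.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class KeyBySketch {

    // Hypothetical stand-in for PCollection#by (not part of the Crunch API):
    // pairs every value with the key computed from it, preserving input order.
    public static <K, V> List<Map.Entry<K, V>> keyBy(List<V> values, Function<V, K> keyFn) {
        List<Map.Entry<K, V>> table = new ArrayList<>();
        for (V value : values) {
            table.add(new SimpleImmutableEntry<>(keyFn.apply(value), value));
        }
        return table;
    }

    public static void main(String[] args) {
        List<String> users = Arrays.asList("alice@example.com", "bob@example.org");

        // Key each address by its domain, i.e. the part after '@'.
        List<Map.Entry<String, String>> byDomain =
            keyBy(users, u -> u.substring(u.indexOf('@') + 1));

        for (Map.Entry<String, String> entry : byDomain) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

The values themselves are left untouched; only a key is attached to each one, which is
why the result can later be reduced back to the original collection by dropping the keys.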

- Gabriel

> 
> 
>> On Wed, Feb 19, 2014 at 5:49 PM, Josh Wills <jwills@cloudera.com> wrote:
>> 
>> org.apache.crunch.lib.PTables.asPTable is likely what you want.
>> 
>> 
>> On Wed, Feb 19, 2014 at 3:47 PM, Jinal Shah <jinalshah2007@gmail.com>
>> wrote:
>> 
>>> Hi everyone,
>>> 
>>> Is there a generic way of converting a PCollection to a PTable? If not, can
>>> we create a generic class? There are a lot of places where we want to
>>> perform a join on 2 PCollections, so we have to convert them into PTables,
>>> do the join, and then convert the result back into a PCollection. So I was
>>> wondering whether there is a better way of doing this.
>>> 
>>> Thanks
>> 
>> 
>> 
>> --
>> Director of Data Science
>> Cloudera <http://www.cloudera.com>
>> Twitter: @josh_wills <http://twitter.com/josh_wills>
>> 
