crunch-user mailing list archives

From Nipur Patodi <er.nipur.pat...@gmail.com>
Subject Re: Multiple output from crunch
Date Mon, 06 Jul 2015 18:57:44 GMT
Thanks much, Josh.

Do we have something similar for Avro Parquet files as well?

Thanks,

_Nipur



On Tue, Jul 7, 2015 at 12:17 AM, Nipur Patodi <er.nipur.patodi@gmail.com>
wrote:

> Hi All,
>
> I am very new to Crunch.
>
> I am trying to read data from a CSV file using MR pipelines. I need to
> convert and bucketize this data on the basis of the timestamp, which is
> a field in the CSV. I need to write the data for each timestamp into a
> single file.
>
> This scenario is equivalent to writing the values (records) for each key
> (the timestamp) to a different file.
>
> I can achieve this using the multiple output format in MapReduce.
>
> Do we have any equivalent concept or design pattern to achieve the same
> behavior using Crunch?
>
> Please suggest.
>
> Thanks,
>
> _Nipur
>
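
To make the requested behavior concrete, here is a minimal plain-Java sketch of the bucketing described in the quoted message: CSV rows are grouped by their timestamp field so that each key ends up with its own list of records (one list per eventual output file). It uses no Crunch or Hadoop APIs and is not how Crunch implements this. The class name TimestampBuckets, the method bucketByTimestamp, the sample rows, and the assumption that the timestamp sits in column 0 are all hypothetical, and the naive comma split does not handle quoted fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TimestampBuckets {

    // Groups CSV lines by the value found in tsColumn (assumed to be the
    // timestamp field). Each map entry stands for one output file.
    static Map<String, List<String>> bucketByTimestamp(List<String> csvLines, int tsColumn) {
        Map<String, List<String>> buckets = new LinkedHashMap<>();
        for (String line : csvLines) {
            // Naive split: assumes no quoted commas inside fields.
            String[] fields = line.split(",", -1);
            if (tsColumn < 0 || tsColumn >= fields.length) {
                continue; // skip rows that lack the timestamp column
            }
            String key = fields[tsColumn].trim();
            buckets.computeIfAbsent(key, k -> new ArrayList<>()).add(line);
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows: timestamp, name, value.
        List<String> rows = Arrays.asList(
                "2015-07-06,a,1",
                "2015-07-07,b,2",
                "2015-07-06,c,3");

        Map<String, List<String>> buckets = bucketByTimestamp(rows, 0);
        for (Map.Entry<String, List<String>> e : buckets.entrySet()) {
            // In a real job, each bucket would be written to its own file.
            System.out.println(e.getKey() + " has " + e.getValue().size() + " record(s)");
        }
    }
}
```

In a distributed pipeline the grouping would be done by the framework's shuffle rather than an in-memory map; the sketch only shows the key-to-records layout being asked for.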
