crunch-user mailing list archives

From Nipur Patodi <>
Subject Multiple output from crunch
Date Mon, 06 Jul 2015 18:47:39 GMT
Hi All,

I am very new to Crunch.

I am trying to read data from a CSV file using MR pipelines. I need to
convert and bucketize this data on the basis of the timestamp, which is a
field in the CSV. I need to write the data for each timestamp into a
single file.

This scenario is equivalent to writing the values (records) for each key
(the timestamp) to a different file.
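
To make the intended behavior concrete, here is a minimal sketch in plain
Python (not Crunch, and not MapReduce). It assumes the timestamp is the
first CSV column and that timestamp values are safe to use as file names;
the names bucket_by_timestamp and write_buckets are hypothetical and
exist only for this illustration.

```python
import csv
import io
import os
import tempfile
from collections import defaultdict


def bucket_by_timestamp(csv_text, ts_column=0):
    """Group CSV rows by the value found in the timestamp column.

    Assumption: ts_column is the index of the timestamp field.
    """
    buckets = defaultdict(list)
    for row in csv.reader(io.StringIO(csv_text)):
        if row:  # skip blank lines
            buckets[row[ts_column]].append(row)
    return dict(buckets)


def write_buckets(buckets, out_dir):
    """Write each bucket to its own file, named <timestamp>.csv.

    Assumption: every timestamp is a valid file name on this platform.
    """
    paths = {}
    for ts, rows in buckets.items():
        path = os.path.join(out_dir, f"{ts}.csv")
        with open(path, "w", newline="") as fh:
            csv.writer(fh).writerows(rows)
        paths[ts] = path
    return paths


if __name__ == "__main__":
    sample = "20150706,a,1\n20150707,b,2\n20150706,c,3\n"
    with tempfile.TemporaryDirectory() as out_dir:
        written = write_buckets(bucket_by_timestamp(sample), out_dir)
        for ts in sorted(written):
            print(ts, "->", os.path.basename(written[ts]))
```

The grouping step corresponds to keying each record by its timestamp, and
the writing step corresponds to one output file per key.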

I can achieve this using the multiple output format in MapReduce.

Is there an equivalent concept or design pattern to achieve the same
behavior using Crunch?

Please suggest.


