hadoop-mapreduce-user mailing list archives

From Rex X <dnsr...@gmail.com>
Subject Re: Hadoop Streaming: How to partition output into subfolders?
Date Fri, 22 Jan 2016 05:19:46 GMT
Hi Camusensei,

Thank you. That's very helpful!

Rex


On Thu, Jan 21, 2016 at 1:41 AM, Namikaze Minato <lloydsensei@gmail.com>
wrote:

> Hi Rex X,
>
> We are using the -outputFormat <classname> option of hadoop-streaming.
> Here is the detail: http://www.infoq.com/articles/HadoopOutputFormat
>
> Regards,
> Camusensei
>
> On 21 January 2016 at 07:21, Rex X <dnsring@gmail.com> wrote:
> > Thank you, Rohit!
> >
> > Any multiple outputs sample code in python?
> >
> > Rex
> >
> >
> > On Wed, Jan 20, 2016 at 10:04 PM, rohit sarewar <rohitsarewar@gmail.com>
> > wrote:
> >>
> >> Hi Rex
> >>
> >> Please explore multiple outputs.
> >>
> >> Regards
> >> Rohit Sarewar
> >>
> >>
> >> On Thu, Jan 21, 2016 at 5:13 AM, Rex X <dnsring@gmail.com> wrote:
> >>>
> >>> Dear all,
> >>>
> >>> To be specific, for example, given
> >>>
> >>>     hadoop jar hadoop-streaming.jar \
> >>>       -input myInputDirs \
> >>>       -output myOutputDir \
> >>>       -mapper /bin/cat \
> >>>       -reducer /usr/bin/wc
> >>>
> >>> Where myInputDirs has a dated subfolder structure of
> >>>
> >>>        /input_dir/yyyy/mm/dd/part-*
> >>>
> >>> I want myOutputDir to have the same dated subfolder structure:
> >>>
> >>>        /output_dir/yyyy/mm/dd/part-*
> >>>
> >>> I guess there should be an option to do this. Can "-partitioner" or any
> >>> "-D" option achieve this?
> >>>
> >>>
> >>> Thanks & regards,
> >>> Rex
> >>
> >>
> >
>
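
Since the thread asks for a Python sample, here is a minimal sketch of the mapper side only. It assumes the /input_dir/yyyy/mm/dd/part-* layout from the original question and relies on Hadoop Streaming exporting the current input file path to the task environment as mapreduce_map_input_file (map_input_file on older releases). The function names date_prefix and run_mapper are hypothetical, chosen for illustration. The mapper only tags each record with a "yyyy/mm/dd" key. Writing each key to its own subfolder would still need a Java output format class passed through the -outputFormat option mentioned above, for example a subclass of org.apache.hadoop.mapred.lib.MultipleTextOutputFormat. That class is not shown here.

```python
import os
import re
import sys

# Matches a /yyyy/mm/dd/ segment anywhere in a path.
# Assumption: input files follow the dated layout described in the thread.
DATE_DIR = re.compile(r"/(\d{4})/(\d{2})/(\d{2})/")


def date_prefix(path):
    """Return 'yyyy/mm/dd' taken from the path, or None if no dated segment is found."""
    match = DATE_DIR.search(path)
    return "/".join(match.groups()) if match else None


def run_mapper(lines, out, environ):
    """Write each input line to `out` as '<yyyy/mm/dd>TAB<line>'."""
    # Streaming exports job configuration values as environment variables,
    # with dots replaced by underscores.
    path = environ.get("mapreduce_map_input_file") or environ.get("map_input_file", "")
    prefix = date_prefix(path) or "unknown"  # hypothetical fallback key
    for line in lines:
        out.write("%s\t%s\n" % (prefix, line.rstrip("\n")))
```

To use this as the streaming mapper, save it as a script that ends with a call to run_mapper(sys.stdin, sys.stdout, os.environ). A custom output format can then derive the output subfolder from the key of each record.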
