hadoop-common-user mailing list archives

From 皮皮 <pi.bingf...@gmail.com>
Subject Re: Multipleoutput file
Date Fri, 22 May 2009 00:58:20 GMT
Thank you for your reply, Jason.

Well, what should I do if I just want to read certain files in the directory,
not all of them?
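
A minimal sketch of one way to do this, assuming the names in the directory
are already known as a list: keep only the names with a given prefix and join
them into the comma-separated string that setInputPaths accepts. The class
InputPathSelector, its methods, and the "/out" directory are hypothetical
names for illustration, not part of Hadoop. The same prefix check could
instead serve as the body of a PathFilter's accept(Path) method if the
FileInputFormat.setInputPathFilter route suggested below is preferred.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Hadoop: picks the wanted files out of a
// directory listing and joins them into one comma-separated string.
final class InputPathSelector {

    // Accepts only file names that start with the given prefix, for example
    // the files written by one named output.
    static boolean accept(String fileName, String prefix) {
        return fileName.startsWith(prefix);
    }

    // Builds "dir/name1,dir/name2,..." from the accepted names, in listing order.
    static String commaSeparatedPaths(String dir, List<String> fileNames, String prefix) {
        return fileNames.stream()
                .filter(name -> accept(name, prefix))
                .map(name -> dir + "/" + name)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Assumed directory contents, for illustration only.
        List<String> listing = List.of(
                "MAPPINGOUTPUT-r-00001", "MAPPINGOUTPUT-r-00002", "part-00000");
        String paths = commaSeparatedPaths("/out", listing, "MAPPINGOUTPUT-r-");
        System.out.println(paths);
        // In a driver, this string would then be handed to
        // FileInputFormat.setInputPaths(job, paths).
    }
}
```

Calling setInputPaths once with the joined string avoids the problem of a
second call replacing the paths set by the first.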

2009/5/21 jason hadoop <jason.hadoop@gmail.com>

> setInputPaths will take an array, or variable arguments.
> Or you can simply provide the directory that the individual files reside in,
> and the individual files will be added.
>
> If there are other files in the directory, you may need to specify a custom
> input path filter via FileInputFormat.setInputPathFilter.
>
>
> 2009/5/21 皮皮 <pi.bingfeng@gmail.com>
>
> > Yes, but how can I get the commaSeperatedPaths? I can't specify them by
> > hand.
> >
> > It's not practical to do this:
> >
> > commaSeperatedPaths_1 = "MAPPINGOUTPUT-r-00001";
> > commaSeperatedPaths_2 = "MAPPINGOUTPUT-r-00002";
> >
> > FileInputFormat.setInputPaths(job, commaSeperatedPaths_1);
> > FileInputFormat.setInputPaths(job, commaSeperatedPaths_2);
> >
> >
> >
> > 2009/4/7 Brian MacKay <Brian.MacKay@medecision.com>
> >
> > >
> > > Not sure about your question:  seems like you'd like to do this...?
> > >
> > > After you run job, your output may be like MAPPINGOUTPUT-r-00001,
> > > MAPPINGOUTPUT-r-00002, etc.
> > >
> > > You'd need to set them as multiple inputs.
> > >
> > > FileInputFormat.setInputPaths(job, commaSeperatedPaths);
> > >
> > >
> > > Brian
> > >
> > > -----Original Message-----
> > > From: 皮皮 [mailto:pi.bingfeng@gmail.com]
> > > Sent: Tuesday, April 07, 2009 3:30 AM
> > > To: core-user@hadoop.apache.org
> > > Subject: Re: Multiple k,v pairs from a single map - possible?
> > >
> > > Could anybody tell me how to read one of the MultipleOutputs files in
> > > another job configuration?
> > >
> > > 2009/4/3 皮皮 <pi.bingfeng@gmail.com>
> > >
> > > > Thank you very much. This is what I was looking for.
> > > >
> > > > 2009/3/27 Brian MacKay <Brian.MacKay@medecision.com>
> > > >
> > > >
> > > >> Amandeep,
> > > >>
> > > >> Add this to your driver.....
> > > >>
> > > > >> MultipleOutputs.addNamedOutput(conf, "PHONE", TextOutputFormat.class,
> > > > >>                    Text.class, Text.class);
> > > > >>
> > > > >> MultipleOutputs.addNamedOutput(conf, "NAME",
> > > > >>                    TextOutputFormat.class, Text.class, Text.class);
> > > >>
> > > >>
> > > >>
> > > >> And in your reducer....
> > > >>
> > > >>  private MultipleOutputs mos;
> > > >>
> > > > >> public void reduce(Text key, Iterator<Text> values,
> > > > >>            OutputCollector<Text, Text> output, Reporter reporter)
> > > > >>            throws IOException {
> > > >>
> > > >>
> > > >>          // namedOutPut = either PHONE or NAME
> > > >>
> > > >>        while (values.hasNext()) {
> > > >>            String value = values.next().toString();
> > > >>            mos.getCollector(namedOutPut, reporter).collect(
> > > >>                    new Text(value), new Text(othervals));
> > > >>        }
> > > >>    }
> > > >>
> > > >>    @Override
> > > >>    public void configure(JobConf conf) {
> > > >>        super.configure(conf);
> > > >>        mos = new MultipleOutputs(conf);
> > > >>    }
> > > >>
> > > >>    public void close() throws IOException {
> > > >>        mos.close();
> > > >>    }
> > > >>
> > > >>
> > > >>
> > > > >> By the way, have you had a chance to post your Oracle fix to
> > > > >> DBInputFormat?
> > > > >> If so, what is the Jira tag #?
> > > >>
> > > >> Brian
> > > >>
> > > >> -----Original Message-----
> > > >> From: Amandeep Khurana [mailto:amansk@gmail.com]
> > > >> Sent: Friday, March 27, 2009 5:46 AM
> > > >> To: core-user@hadoop.apache.org
> > > >> Subject: Multiple k,v pairs from a single map - possible?
> > > >>
> > > > >> Is it possible to output multiple key value pairs from a single map
> > > > >> function run?
> > > > >>
> > > > >> For example, the mapper outputting <name,phone> and <name,address>
> > > > >> simultaneously...
> > > >>
> > > >> Can I write multiple output.collect(...) commands?
> > > >>
> > > >> Amandeep
> > > >>
> > > >> Amandeep Khurana
> > > >> Computer Science Graduate Student
> > > >> University of California, Santa Cruz
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > > >> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> > > > >>
> > > > >> The information transmitted is intended only for the person or entity to
> > > > >> which it is addressed and may contain confidential and/or privileged
> > > > >> material. Any review, retransmission, dissemination or other use of, or
> > > > >> taking of any action in reliance upon, this information by persons or
> > > > >> entities other than the intended recipient is prohibited. If you received
> > > > >> this message in error, please contact the sender and delete the material
> > > > >> from any computer.
> > > >>
> > > >>
> > > >>
> > > >
> > >
> > >
> >
>
>
>
> --
> Alpha Chapters of my book on Hadoop are available
> http://www.apress.com/book/view/9781430219422
> www.prohadoopbook.com a community for Hadoop Professionals
>
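
On the question that started the thread (whether one map call can emit
several key/value pairs): a map function may call collect as many times as it
likes for a single input record. The sketch below shows the pattern without
depending on Hadoop. The nested Collector interface is a stand-in for Hadoop's
OutputCollector, and the class name and the "name,phone,address" record layout
are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Hadoop code: one map call emitting two pairs.
final class MultiEmitSketch {

    // Stand-in for Hadoop's OutputCollector so the sketch runs on its own.
    interface Collector<K, V> {
        void collect(K key, V value);
    }

    // Emits <name, phone> and <name, address> for a single input record.
    // Assumes the record has exactly three comma-separated fields.
    static void map(String record, Collector<String, String> output) {
        String[] fields = record.split(",");
        output.collect(fields[0], fields[1]); // <name, phone>
        output.collect(fields[0], fields[2]); // <name, address>
    }

    public static void main(String[] args) {
        List<String> emitted = new ArrayList<>();
        map("alice,555-0100,12 Main St", (k, v) -> emitted.add(k + "=" + v));
        System.out.println(emitted);
    }
}
```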
