hadoop-general mailing list archives

From Jason Venner <jason.had...@gmail.com>
Subject Re: CHANGING FINAL OUTPUT FILE NAME
Date Fri, 02 Oct 2009 01:05:04 GMT
I see four ways to go here.

   1. The one I know works: create a RecordWriter in the configure method
   of your task, in the per-task work/output directory, and rename the
   file to your chosen name in close. Your task calls write on the
   RecordWriter directly instead of calling output.collect.
   2. Use the multiple output format (MultipleOutputFormat).
   3. In the close method of the task, rename the part-xxx file to your
   chosen name. I am not certain that this is safe in the close method
   of the task.
   4. Define a custom OutputCommitter class that renames the file to your
   chosen name.
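
To make the sequence in option 1 concrete, here is a minimal sketch of the open / write / rename-on-close pattern. It is not Hadoop code: it uses only java.nio on a local file system so it can be run anywhere, and RenamingWriter, the "_tmp_" prefix and the tab-separated line format are all hypothetical names chosen for illustration. In a real 0.20 task the constructor and close logic would live in configure and close, the directory would be the per-task work output path, and the rename would go through Hadoop's FileSystem.rename(Path, Path) rather than Files.move.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not a Hadoop class: it mimics option 1 on a local
// file system so the open/write/rename sequence is easy to follow.
class RenamingWriter implements AutoCloseable {
    private final Path tempFile;   // file written while the task runs
    private final Path finalFile;  // name chosen by the user
    private final BufferedWriter out;

    // Plays the role of configure(): open the writer inside the work directory.
    RenamingWriter(Path workDir, String finalName) throws IOException {
        Files.createDirectories(workDir);
        this.tempFile = workDir.resolve("_tmp_" + finalName);
        this.finalFile = workDir.resolve(finalName);
        this.out = Files.newBufferedWriter(tempFile, StandardCharsets.UTF_8);
    }

    // Called directly by the task in place of output.collect(key, value).
    void write(String key, String value) throws IOException {
        out.write(key);
        out.write('\t');
        out.write(value);
        out.write('\n');
    }

    // Plays the role of close(): finish writing, then rename to the chosen name.
    @Override
    public void close() throws IOException {
        out.close();
        Files.move(tempFile, finalFile, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Writing under a temporary name and renaming only in close means a reader never sees a half-written file under the final name. With more than one reduce task, each task would need its own final name, otherwise the renames collide.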




On Thu, Oct 1, 2009 at 1:00 PM, Alberto Luengo Cabanillas <cabiwan@gmail.com> wrote:

> Hi everyone! I have a newbie question: I'm currently using Hadoop 0.20.1 and
> I'd like to know how I can change the name of the resulting file to the
> one I want (i.e. from "part-r-00000" to "myoutput"). I've found something
> related in JIRA (https://issues.apache.org/jira/browse/MAPREDUCE-370) but
> I don't know for sure if that is my problem too. If it is, do I apply the
> patch to the affected file and I'm ready to go, or do I need to do
> something more afterwards?
> Thanks a lot!
>
> --
> Alberto
>



-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals
