hadoop-mapreduce-user mailing list archives

From Panayotis Antonopoulos <antonopoulos...@hotmail.com>
Subject RE: MultipleOutputs Files remain in temporary folder
Date Tue, 31 May 2011 02:37:27 GMT

I forgot to mention that there is no _SUCCESS file in the output directory.

Thanks for your replies!

I use TableOutputFormat to delete entries from an HBase table and MultipleOutputs (MO) for the HFileOutputFormat output.
Until yesterday I used plain HFileOutputFormat output (not MO), and the files appeared in
the output directory rather than in the temporary folder.
I performed the deletes manually inside the reducer.
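
For context, the reducer in that earlier setup looked roughly like the sketch below. It is only a sketch assuming HBase 0.90-era classes; the table name and the delete condition are placeholders for whatever the real job uses.

import java.io.IOException;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.mapreduce.Reducer;

public class HFileAndDeleteReducer
    extends Reducer<ImmutableBytesWritable, KeyValue,
                    ImmutableBytesWritable, KeyValue> {

  private HTable table;

  @Override
  protected void setup(Context context) throws IOException {
    // Open the table directly; deletes bypass the job's OutputFormat.
    table = new HTable(context.getConfiguration(), "my_table"); // placeholder table
  }

  @Override
  protected void reduce(ImmutableBytesWritable row, Iterable<KeyValue> values,
      Context context) throws IOException, InterruptedException {
    for (KeyValue kv : values) {
      if (shouldDelete(kv)) {
        // Manual delete, issued straight against the table.
        table.delete(new Delete(row.get()));
      } else {
        // Normal output goes to HFileOutputFormat, so FileOutputCommitter
        // promotes the HFiles out of _temporary at commit time.
        context.write(row, kv);
      }
    }
  }

  @Override
  protected void cleanup(Context context) throws IOException {
    table.close();
  }

  private boolean shouldDelete(KeyValue kv) {
    return false; // placeholder for the real condition
  }
}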

Then I decided to use MO for the HFiles and TableOutputFormat for the deletes.
Now the HFiles remain in the temporary folder.

I think that TableOutputFormat uses its own committer, and I do not explicitly change the
committer.
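
To make it concrete, the current driver looks roughly like the sketch below. Class, table and path names are placeholders, and I am assuming the committer TableOutputFormat hands back is its no-op TableOutputCommitter.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public class JobSetupSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set(TableOutputFormat.OUTPUT_TABLE, "my_table"); // placeholder table

    Job job = new Job(conf, "hfiles-and-deletes");
    job.setJarByClass(JobSetupSketch.class);

    // The job-level OutputFormat is what supplies the OutputCommitter.
    // With TableOutputFormat that committer does nothing, so nothing ever
    // promotes the MultipleOutputs files out of _temporary/_attempt_*.
    job.setOutputFormatClass(TableOutputFormat.class);

    // Side output for the HFiles, written through MultipleOutputs.
    MultipleOutputs.addNamedOutput(job, "hfiles", HFileOutputFormat.class,
        ImmutableBytesWritable.class, KeyValue.class);
    FileOutputFormat.setOutputPath(job, new Path("/out/hfiles")); // placeholder path

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

If that is right, MO still writes the HFiles under the task attempt directory as usual, but only FileOutputCommitter ever moves them out of _temporary, so with a no-op committer they just stay there.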

Do you have anything to suggest?



> From: harsh@cloudera.com
> Date: Tue, 31 May 2011 01:20:57 +0530
> Subject: Re: MultipleOutputs Files remain in temporary folder
> To: mapreduce-user@hadoop.apache.org
> 
> Panayotis,
> 
> I've not seen this happen yet. I've regularly used MO to write my
> files and both TextFileO/F and NullO/F have worked fine despite me not
> writing a byte to their collectors. In fact, the test case for MO too
> passes when I modify it to never emit to the default output sink.
> 
> Are you using the default OutputCommitter (FileOutputCommitter)?
> 
> 2011/5/30 Panayotis Antonopoulos <antonopoulospan@hotmail.com>:
> > Hello,
> > I just noticed that the files created using MultipleOutputs remain
> > in the temporary folder, inside attempt sub-folders, when there is no normal
> > output (using context.write(...)).
> >
> > Has anyone else noticed that?
> > Is there any way to change that and make the files appear in the output
> > directory?
> >
> > Thank you in advance!
> > Panagiotis.
> >
> 
> 
> 
> -- 
> Harsh J