avro-user mailing list archives

From Lewis John Mcgibbney <lewis.mcgibb...@gmail.com>
Subject Re: How to write a "custom" file name in the Hadoop's MapReduce with Avro format
Date Thu, 20 Mar 2014 09:06:18 GMT
Hi,

On Thu, Mar 20, 2014 at 3:29 AM, Phan, Truong Q <
Troung.Phan@team.telstra.com> wrote:

> Damned if you do, damned if you don't. :)
>

;) we will get there eventually. All I meant was that it seemed a very long
message for what you have summarized very well below.


> None of the links on the page linked below work; they display the error
> message shown below.
>
> BTW, I am using Python not Java to code Avro.
>
> <snip>
>
> 1)      Can I control the filename
> http://wiki.apache.org/hadoop/FAQ#How_do_I_change_final_output_file_name_with_the_desired_name_rather_than_in_partitions_like_part-00000.2C_part-00001.3F
> </snip>
>
> <snip>
>
> *An Exception Has Occurred*
>
> Unknown location:
> /hadoop/core/trunk/src/mapred/org/apache/hadoop/mapred/TextOutputFormat.java
>
> *HTTP Response Status*
>
> 404 Not Found
>
> </snip>
>

I've now fixed this and you should be able to access the links no problem.


>
>
> Here are my Avro/Python/MapReduce questions:
>
> 1)      If I am not using Hadoop's MapReduce Streaming, Avro's
> DataFileWriter writes data into my "custom" filenames. However, if I am
> using Hadoop's MapReduce Streaming, DataFileWriter strangely creates
> empty files with Hadoop's default filenames (part-0000*) in HDFS. How do
> I use Avro's DataFileWriter in Python to write data into my custom file
> name in HDFS?
>
OK, so as I explained, you need to create a class which implements
OutputFormat; that is what lets you change the resulting file name. I
don't think there is any way of achieving this unless you write some code.
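
For what it is worth, here is a rough Python sketch of a different route
from the OutputFormat one: a workaround where the streaming task writes its
Avro file to local disk under a custom name and then copies it into HDFS
itself. It assumes that Hadoop Streaming exports the task partition number
as an environment variable (mapreduce_task_partition on newer releases,
mapred_task_partition on older ones) and that the hadoop command is on the
PATH of the task nodes. The function names, the "events" prefix and the
paths are all made up for illustration; I have not run this on a cluster.

```python
import os
import subprocess


def task_partition(env=None):
    """Return this task's partition number, or 0 if it is not set."""
    env = os.environ if env is None else env
    # Assumption: Hadoop Streaming passes job configuration to the task
    # environment with dots replaced by underscores. The property name
    # differs between older (mapred.*) and newer (mapreduce.*) releases.
    for key in ("mapreduce_task_partition", "mapred_task_partition"):
        if key in env:
            return int(env[key])
    return 0


def custom_file_name(prefix, env=None, suffix=".avro"):
    """Build a per-task file name such as 'events-00003.avro'."""
    return "%s-%05d%s" % (prefix, task_partition(env), suffix)


def put_command(local_path, hdfs_dir):
    """Build the 'hadoop fs -put' command that copies a local file to HDFS."""
    target = hdfs_dir.rstrip("/") + "/" + os.path.basename(local_path)
    return ["hadoop", "fs", "-put", local_path, target]


def publish(local_path, hdfs_dir):
    """Copy the finished local file into HDFS; raises if the copy fails."""
    subprocess.check_call(put_command(local_path, hdfs_dir))
```

Inside the streaming script you would write the file with DataFileWriter
exactly as you do outside streaming, using custom_file_name("events") as
the local file name, close the writer, and then call publish() on it. The
empty part-0000* files would most likely still appear, since Streaming
builds them from whatever the script prints to stdout.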


> 2)      Do you have sample Python code to control the filename and
> location when putting our Avro files into HDFS?
>
Off the top of my head, no, I don't, sorry. Maybe someone else can help you
out here.
