avro-user mailing list archives

From "Phan, Truong Q" <Troung.P...@team.telstra.com>
Subject RE: How to write a "custom" file name in the Hadoop's MapReduce with Avro format
Date Thu, 20 Mar 2014 03:29:46 GMT
Hi Lewis,

Thanks for the response to my questions.
I have seen many questions from users who did not state their objectives, source code,
issues, or questions.
The people helping got annoyed and asked for all of that information.

Damned if you do, damned if you don't. :)

None of the links on the page linked below work; each one returns the error message shown further down.

BTW, I am using Python, not Java, to write my Avro code.
<snip>
1)      Can I control the filename http://wiki.apache.org/hadoop/FAQ#How_do_I_change_final_output_file_name_with_the_desired_name_rather_than_in_partitions_like_part-00000.2C_part-00001.3F
</snip>

<snip>
An Exception Has Occurred
Unknown location: /hadoop/core/trunk/src/mapred/org/apache/hadoop/mapred/TextOutputFormat.java
HTTP Response Status
404 Not Found
</snip>

Here are my Avro/Python/MapReduce questions:

1)      If I am not using Hadoop's MapReduce Streaming, Avro's DataFileWriter writes
data into my "custom" filenames. However, if I am using Hadoop's MapReduce Streaming,
I get empty files with Hadoop's default filenames (part-0000*) in HDFS instead.
How do I use Avro's DataFileWriter in Python to write data into my custom filename in HDFS?

2)      Do you have sample Python code that controls the filename and location of our
Avro files in HDFS?
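
For reference, one possible approach to the two questions above, as a minimal sketch and not a confirmed fix. DataFileWriter writes to whatever file object it is given and knows nothing about HDFS, while the part-* files of a Streaming job are filled from the script's standard output. So a script that prints nothing would likely leave them empty. The sketch writes the Avro file locally under a per-task name and then copies it into HDFS. The helper names (custom_name, put_command, write_and_upload) are hypothetical, the environment variable names depend on the Hadoop version, and it assumes the hadoop command is on the PATH of the task nodes.

```python
import os
import subprocess


def custom_name(prefix, env=os.environ):
    """Build a per-task file name, e.g. 'users-attempt_..._m_000000_0.avro'."""
    # Hadoop Streaming exports job settings as environment variables with
    # dots replaced by underscores. The variable name differs between Hadoop
    # versions, so try both and fall back to the process id.
    task = (env.get("mapreduce_task_attempt_id")
            or env.get("mapred_task_id")
            or "pid%d" % os.getpid())
    return "%s-%s.avro" % (prefix, task)


def put_command(local_path, hdfs_dir):
    """Return the 'hadoop fs -put' argument list for one local file."""
    target = hdfs_dir.rstrip("/") + "/" + os.path.basename(local_path)
    return ["hadoop", "fs", "-put", local_path, target]


def write_and_upload(records, schema_json, prefix, hdfs_dir):
    """Write records to a local Avro file, then copy it into HDFS."""
    # Imported here so the two helpers above work without the avro package.
    import avro.schema
    from avro.datafile import DataFileWriter
    from avro.io import DatumWriter

    local_path = custom_name(prefix)
    # The old avro-python3 package spells this function 'Parse'.
    schema = avro.schema.parse(schema_json)
    writer = DataFileWriter(open(local_path, "wb"), DatumWriter(), schema)
    for record in records:
        writer.append(record)
    writer.close()  # flushes the last block and closes the file
    subprocess.check_call(put_command(local_path, hdfs_dir))
```

Including the task attempt id in the name keeps parallel map or reduce tasks from overwriting each other's files. The target directory passed as hdfs_dir is chosen by the script itself, independently of the job's -output directory.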


Thanks and Regards,
Truong Phan


P    + 61 2 8576 5771
M   + 61 4 1463 7424
E    troung.phan@team.telstra.com
W  www.telstra.com



From: Lewis John Mcgibbney [mailto:lewis.mcgibbney@gmail.com]
Sent: Thursday, 20 March 2014 1:39 PM
To: user@avro.apache.org
Subject: Re: How to write a "custom" file name in the Hadoop's MapReduce with Avro format

Hi,

This is a rather troublesome email with lots of unnecessary material included. I think you
may get more help if you try and refine your question(s).
Anyhow...
On Tue, Mar 18, 2014 at 1:27 AM, Phan, Truong Q <Troung.Phan@team.telstra.com> wrote:

Questions:

1)      Can I control the filename and location to put our Avro's files in the HDFS?
http://wiki.apache.org/hadoop/FAQ#How_do_I_change_final_output_file_name_with_the_desired_name_rather_than_in_partitions_like_part-00000.2C_part-00001.3F

Yes. FileOutputFormat defines a #setOutputPath method which enables you to define the location
the file is written to. You can use this through AvroOutputFormat.


2)      The Hadoop's MapReduce Streaming project has a MapDebug (-mapdebug) and ReduceDebug
(-reducedebug) options but I can't get any debug message for my Map's debug

Please go to Hadoop lists for this.
Hope this helps a bit.
Lewis
