hadoop-mapreduce-user mailing list archives

From "Bejoy KS" <bejoy.had...@gmail.com>
Subject Re: suggest Best way to upload xml files to HDFS
Date Fri, 13 Jul 2012 04:45:00 GMT
Hi Manoj

If you are looking for a scheduler and workflow manager to carry out this task, you can have
a look at Oozie.

If your XML files are small (smaller than the HDFS block size), then it is definitely better
practice to combine them into larger files. Combining them into SequenceFiles should work well.
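
To make the batching idea concrete, here is a rough sketch in plain Python (not Hadoop API code) that groups small files into batches no larger than one block. The function name plan_batches and the 64 MB default are assumptions for illustration only; check the block size configured on your own cluster.

```python
from typing import Iterable, List, Tuple

# Assumption: 64 MB block size. Use the value configured on your cluster.
BLOCK_SIZE = 64 * 1024 * 1024


def plan_batches(files: Iterable[Tuple[str, int]],
                 block_size: int = BLOCK_SIZE) -> List[List[str]]:
    """Group (name, size_in_bytes) pairs into batches of at most block_size bytes.

    Files that are already block-sized or larger get a batch of their own.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_bytes = 0
    # Largest first: oversized files are set aside up front, and the
    # remaining files are packed greedily in that order.
    for name, size in sorted(files, key=lambda f: f[1], reverse=True):
        if size >= block_size:
            batches.append([name])
            continue
        # Close the current batch if this file would push it past one block.
        if current and current_bytes + size > block_size:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        batches.append(current)
    return batches
```

Each batch could then be written out as one SequenceFile, for example with the file name as the key and the XML content as the value. That key/value layout is a common convention, not something the list requires.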

Bejoy KS

Sent from handheld, please excuse typos.

-----Original Message-----
From: Manoj Babu <manoj444@gmail.com>
Date: Fri, 13 Jul 2012 08:59:51 
To: <mapreduce-user@hadoop.apache.org>
Reply-To: mapreduce-user@hadoop.apache.org
Subject: suggest Best way to upload xml files to HDFS


I need to upload large XML files daily. Right now I have a small
program that reads all the files from a local folder and writes them to HDFS as a
single file. Is this the right way?
If there are any best practices or an optimized way to achieve this, kindly let me
know.

Thanks in advance!

