hadoop-common-user mailing list archives

From hari708 <hari...@gmail.com>
Subject hadoop File loading
Date Tue, 22 Nov 2011 01:20:38 GMT

I have a big file consisting of XML data. The XML is not represented as a
single line in the file; each XML document spans several lines. If we load
this file into a Hadoop directory using the ./hadoop dfs -put command, how
does the distribution happen?

Basically, in my MapReduce program I am expecting a complete XML document as
my input. I have a CustomReader (for XML) in my MapReduce job configuration.
My main confusion is this: if the NameNode distributes the data to the
DataNodes, there is a chance that one part of an XML document goes to one
DataNode and the other part goes to another DataNode. If that is the case,
will my custom XMLReader in the MapReduce job be able to combine the two
parts, given that, as I understand it, MapReduce reads data locally only?
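
To make the scenario concrete, here is a small simulation in plain Python.
It does not use any Hadoop API, and every name in it (make_splits,
read_records, the <record> tag, the 40-byte split size) is hypothetical and
chosen only for illustration. It assumes the file is cut at fixed byte
offsets with no regard for XML boundaries, and it sketches one convention a
record reader could follow: a reader emits every record whose start tag
begins inside its own byte range, and reads past the end of that range to
finish the record it has started.

```python
# Simulation only: plain Python, no Hadoop APIs. All names are hypothetical.

START_TAG = b"<record>"
END_TAG = b"</record>"


def make_splits(file_length, split_size):
    """Cut [0, file_length) into fixed-size byte ranges, ignoring XML boundaries."""
    return [(start, min(start + split_size, file_length))
            for start in range(0, file_length, split_size)]


def read_records(data, split_start, split_end):
    """Yield every record whose start tag begins inside [split_start, split_end).

    The reader may read past split_end to finish a record it has started.
    In this simulation that is just a slice of the same bytes object; in a
    distributed file system those bytes could live on another node.
    """
    pos = split_start
    while True:
        begin = data.find(START_TAG, pos)
        if begin == -1 or begin >= split_end:
            return  # the next record, if any, belongs to a later split
        finish = data.find(END_TAG, begin)
        if finish == -1:
            return  # truncated record at end of file: nothing complete to emit
        finish += len(END_TAG)
        yield data[begin:finish]
        pos = finish


# Five multi-line records of 32 bytes each, so 40-byte splits cut through them.
data = b"".join(
    b"<record>\n  <id>%d</id>\n</record>\n" % i for i in range(5)
)
splits = make_splits(len(data), split_size=40)
per_split = [list(read_records(data, s, e)) for s, e in splits]
all_records = [r for records in per_split for r in records]
```

Under this convention each record is emitted exactly once and is always
complete, even though most records here straddle a split boundary. Whether
my reader can behave this way depends on it being allowed to read bytes
that are stored on a different DataNode, which is the point I am unsure
about.
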
Please help me with this.