hadoop-common-user mailing list archives

From Xavier Stevens <xstev...@mozilla.com>
Subject Re: Hadoop XML Error
Date Mon, 07 Feb 2011 17:50:57 GMT

I've seen this when a directory is removed or goes missing after distcp
has started stat-ing the source files.  You'll probably want to make sure
that no code or person is modifying the filesystem during your copy.
Also, you should use hdfs as the destination protocol, since hftp is
read-only.
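
As a minimal sketch of that last point, the command from the original
message might be adjusted as below.  The source URI is taken from the
original command; the destination port 8020 is only an assumption, so
substitute whatever fs.default.name is set to on hadoop2.  The script
prints the command rather than running it, so the URIs can be checked
first.

```shell
# Source: the old cluster (hadoop1), read over HFTP on the namenode's
# HTTP port, exactly as in the original command.
SRC="hftp://mc00001:50070/"

# Destination: the new cluster (hadoop2), written over HDFS.
# ASSUMPTION: port 8020 is a placeholder; use the host:port from
# fs.default.name in hadoop2's configuration.
DST="hdfs://mc00000:8020/"

# Build the command and print it for review before running anything.
DISTCP_CMD="hadoop distcp -update $SRC $DST"
echo "$DISTCP_CMD"

# Uncomment to run once the destination URI is confirmed and nothing
# else is modifying the source filesystem:
# $DISTCP_CMD
```

Running it from hadoop2, as in the original message, keeps the write side
on the newer cluster's own HDFS client.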



On 2/7/11 7:51 AM, Korb, Michael [USA] wrote:
> I am running two instances of Hadoop on a cluster and want to copy all the data from
> hadoop1 to the updated hadoop2. From hadoop2, I am running the command "hadoop distcp -update
> hftp://mc00001:50070/ hftp://mc00000:50070/" where mc00001 is the namenode of hadoop1 and
> mc00000 is the namenode of hadoop2. I get the following error:
> 11/02/07 10:12:31 INFO tools.DistCp: srcPaths=[hftp://mc00001:50070/]
> 11/02/07 10:12:31 INFO tools.DistCp: destPath=hftp://mc00000:50070/
> [Fatal Error] :1:215: XML document structures must start and end within the same entity.
> With failures, global counters are inaccurate; consider running with -i
> Copy failed: java.io.IOException: invalid xml directory content
> 	at org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:350)
> 	at org.apache.hadoop.hdfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:355)
> 	at org.apache.hadoop.hdfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:384)
> 	at org.apache.hadoop.tools.DistCp.sameFile(DistCp.java:1227)
> 	at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1120)
> 	at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
> 	at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> 	at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
> Caused by: org.xml.sax.SAXParseException: XML document structures must start and end within the same entity.
> 	at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
> 	at org.apache.hadoop.hdfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:344)
> 	... 9 more
> I am fairly certain that none of the XML files are malformed or corrupted. This thread
> (http://www.mail-archive.com/core-dev@hadoop.apache.org/msg18064.html) discusses a similar
> problem caused by file permissions but doesn't seem to offer a solution. Any help would be
> Thanks,
> Mike
