hadoop-common-user mailing list archives

From Brahma Reddy Battula <brahmareddy.batt...@huawei.com>
Subject RE: Distcp fails with "Got EOF but currentPos = 240377856 < filelength = 1026034162" error
Date Tue, 19 Jan 2016 05:04:11 GMT
Hi Buntu Dev,

Please check the DataNode logs to find the exact root cause.
One more possible cause (apart from the one Kai mentioned) is that there is not enough direct
buffer memory while copying large files. If you see an OOM in the direct buffer, just increase it.
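
As a rough sketch of what "increase it" could look like: assuming the OutOfMemoryError shows up in the DataNode log, the direct-buffer ceiling of that JVM is controlled by the standard `-XX:MaxDirectMemorySize` flag, which can be passed through `HADOOP_DATANODE_OPTS` in `hadoop-env.sh`. The `1g` value is only a placeholder assumption, not a figure from this thread.

```shell
# hadoop-env.sh on each DataNode (sketch; the 1g value is an assumed placeholder).
# -XX:MaxDirectMemorySize caps the memory available to NIO direct buffers in that JVM.
export HADOOP_DATANODE_OPTS="-XX:MaxDirectMemorySize=1g ${HADOOP_DATANODE_OPTS:-}"
```

The DataNode has to be restarted to pick this up. If the OOM turns out to be in the distcp map task instead, the same JVM flag would belong in `mapreduce.map.java.opts`.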

Hope it’s helpful.



From: Buntu Dev [mailto:buntudev@gmail.com]
Sent: 19 January 2016 09:15
To: Zheng, Kai
Cc: user@hadoop.apache.org
Subject: Re: Distcp fails with "Got EOF but currentPos = 240377856 < filelength = 1026034162" error

Thanks Kai, but I checked the Parquet file that was reported to have issues, and fsck says the
file is healthy.



On Mon, Jan 18, 2016 at 7:09 PM, Zheng, Kai <kai.zheng@intel.com> wrote:
It looks like a file being copied ended unexpectedly. You may need to find out which file it is,
then check or read it by other means to make sure it is not corrupt.
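
One way to read the file "by other means" is to stream it end to end and compare the byte count with the length the source reports, which is the kind of shortfall the error message describes. The sketch below assumes a POSIX shell; `compare_length` is a hypothetical helper, not part of Hadoop, and the path in the usage comment is a placeholder.

```shell
# compare_length EXPECTED_BYTES
# Hypothetical helper: counts the bytes arriving on stdin and compares the
# total with EXPECTED_BYTES. Returns non-zero when the counts differ.
compare_length() {
  expected="$1"
  actual=$(wc -c | tr -d '[:space:]')
  if [ "$actual" -ne "$expected" ]; then
    echo "length mismatch: read $actual of $expected bytes"
    return 1
  fi
  echo "ok: read $actual bytes"
}

# Usage against the source cluster (placeholder path, needs a Hadoop client):
#   f='hftp://cluster1/user/hive/warehouse/dir1/<file>'
#   hadoop fs -cat "$f" | compare_length "$(hadoop fs -stat %b "$f")"
```

A mismatch here on a file that fsck reports as healthy would point at the read path (for example the hftp transfer) rather than at the stored blocks.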

Regards,
Kai

From: Buntu Dev [mailto:buntudev@gmail.com]
Sent: Tuesday, January 19, 2016 5:46 AM
To: user@hadoop.apache.org
Subject: Distcp fails with "Got EOF but currentPos = 240377856 < filelength = 1026034162" error

I'm using distcp with these options to copy a hdfs directory from one cluster to another:

~~~~
hadoop distcp -prb -i -update -skipcrccheck -delete \
    hftp://cluster1/user/hive/warehouse/dir1/ \
    hdfs://cluster2/dir1/
~~~~

I keep running into EOF-related errors like the one below. What could be causing them, and how
can I fix them?

~~~~~~~~~
Caused by: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: java.io.IOException: Got EOF but currentPos = 240377856 < filelength = 1026034162
            at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:289)
            at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyBytes(RetriableFileCopyCommand.java:257)
            at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyToFile(RetriableFileCopyCommand.java:184)
            at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:124)
            at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:100)
            at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
            ... 11 more
~~~~~~~~~~


Also, I'm using '-i' to ignore failures and continue, but distcp still retries 3 times
and then stops. Can anyone shed some light on what else could be going wrong?


Thanks!
