hadoop-common-user mailing list archives

From Bryan Duxbury <br...@rapleaf.com>
Subject Re: Issue distcp'ing from 0.19.2 to 0.18.3
Date Thu, 09 Apr 2009 15:56:49 GMT
Ah, never mind. It turns out that I just shouldn't rely on command
history so much. I accidentally pointed the hftp:// at the actual
namenode port, not the namenode HTTP port. It appears to be starting
a regular copy again.
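
For reference, a working invocation would just point hftp:// at the
namenode's HTTP port (dfs.http.address, 50070 by default; adjust if your
cluster overrides it) instead of the RPC port:

  # assumes the default namenode HTTP port of 50070 on the 0.19 source cluster
  hadoop distcp hftp://ds-nn1:50070/ hdfs://ds-nn2:7276/cluster-a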

-Bryan

On Apr 8, 2009, at 11:57 PM, Todd Lipcon wrote:

> Hey Bryan,
>
> Any chance you can get a tshark trace on the 0.19 namenode? Maybe
> tshark -s 100000 -w nndump.pcap port 7276
>
> Also, are the clocks synced on the two machines? The failure of your
> distcp is at 23:32:39, but the namenode log message you posted was
> 23:29:09. Did those messages actually pop out at the same time?
>
> Thanks
> -Todd
>
> On Wed, Apr 8, 2009 at 11:39 PM, Bryan Duxbury <bryan@rapleaf.com> wrote:
>
>> Hey all,
>>
>> I was trying to copy some data from our cluster on 0.19.2 to a new
>> cluster on 0.18.3 by using distcp and the hftp:// filesystem.
>> Everything seemed to be going fine for a few hours, but then a few
>> tasks failed because a few files got 500 errors when being read from
>> the 19 cluster. As a result the job died. Now that I'm trying to
>> restart it, I get this error:
>>
>> [rapleaf@ds-nn2 ~]$ hadoop distcp hftp://ds-nn1:7276/ hdfs://ds-nn2:7276/cluster-a
>> 09/04/08 23:32:39 INFO tools.DistCp: srcPaths=[hftp://ds-nn1:7276/]
>> 09/04/08 23:32:39 INFO tools.DistCp: destPath=hdfs://ds-nn2:7276/cluster-a
>> With failures, global counters are inaccurate; consider running with -i
>> Copy failed: java.net.SocketException: Unexpected end of file from server
>>        at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:769)
>>        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632)
>>        at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:766)
>>        at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632)
>>        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1000)
>>        at org.apache.hadoop.dfs.HftpFileSystem$LsParser.fetchList(HftpFileSystem.java:183)
>>        at org.apache.hadoop.dfs.HftpFileSystem$LsParser.getFileStatus(HftpFileSystem.java:193)
>>        at org.apache.hadoop.dfs.HftpFileSystem.getFileStatus(HftpFileSystem.java:222)
>>        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:667)
>>        at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:588)
>>        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:609)
>>        at org.apache.hadoop.tools.DistCp.run(DistCp.java:768)
>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>        at org.apache.hadoop.tools.DistCp.main(DistCp.java:788)
>>
>> I changed nothing at all between the first attempt and the subsequent
>> failed attempts. The only clues in the namenode log for the 19
>> cluster are:
>>
>> 2009-04-08 23:29:09,786 WARN org.apache.hadoop.ipc.Server: Incorrect header or version mismatch from 10.100.50.252:47733 got version 47 expected version 2
>>
>> Anyone have any ideas?
>>
>> -Bryan
>>

