hadoop-common-user mailing list archives

From Raghu Angadi <rang...@yahoo-inc.com>
Subject Re: Hadoop 0.17.0 - lots of I/O problems and can't run small datasets?
Date Mon, 07 Jul 2008 18:13:30 GMT
ConcurrentModificationException looks like a bug; we should file a JIRA for it.
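
For anyone following the thread, here is a minimal sketch of how this exception generally arises from a TreeMap iterator, which is where the quoted trace ends up (TreeMap$KeyIterator.next under DFSClient.close). This is not the DFSClient code: the class, method, and map names are hypothetical, and whether DFSClient hits this single-threaded pattern or a cross-thread variant of it is an assumption that needs checking against the source.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.TreeMap;

public class TreeMapCmeSketch {

    // Iterates the key set while removing entries directly from the map.
    // Returns true if the fail-fast iterator detected the modification.
    static boolean unsafeCloseAll(TreeMap<String, String> openFiles) {
        try {
            for (String path : openFiles.keySet()) {
                // Structural change made behind the iterator's back.
                openFiles.remove(path);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Iterates over a snapshot of the keys, so removing from the map
    // does not disturb the iteration. Returns the remaining entry count.
    static int safeCloseAll(TreeMap<String, String> openFiles) {
        for (String path : new ArrayList<>(openFiles.keySet())) {
            openFiles.remove(path);
        }
        return openFiles.size();
    }

    // Hypothetical stand-in for a table of files that are still open.
    static TreeMap<String, String> sample() {
        TreeMap<String, String> m = new TreeMap<>();
        m.put("part-00001", "open");
        m.put("part-00002", "open");
        m.put("part-00003", "open");
        return m;
    }

    public static void main(String[] args) {
        System.out.println("unsafe loop threw CME: " + unsafeCloseAll(sample()));
        System.out.println("entries left after safe loop: " + safeCloseAll(sample()));
    }
}
```

The snapshot copy only covers the single-threaded case shown here. If two threads touch the map, as a shutdown hook and a writer thread might, the map also needs synchronization.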

As for why the writes are failing, we need to look at more logs. Could 
you attach the complete log from one of the failed tasks? Also check whether 
there is anything in the NameNode log around that time.
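
If it helps with pulling out the relevant part of a log, below is a rough sketch of filtering lines that fall within a window around a failure time. It assumes each line starts with a timestamp in the same yy/MM/dd HH:mm:ss form as the JobClient output quoted in this thread; the NameNode log may use a different layout depending on the log4j configuration, in which case the pattern needs adjusting. The class and method names are made up for illustration.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LogWindowSketch {

    // Assumed timestamp layout, e.g. "08/07/06 01:43:29" (17 characters).
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yy/MM/dd HH:mm:ss");
    private static final int TS_LEN = 17;

    // Returns the lines whose leading timestamp is within `window`
    // of `failureTime`. Lines without a parseable timestamp are skipped.
    static List<String> linesNear(List<String> lines, String failureTime, Duration window) {
        LocalDateTime center = LocalDateTime.parse(failureTime, TS);
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            if (line.length() < TS_LEN) {
                continue;
            }
            try {
                LocalDateTime t = LocalDateTime.parse(line.substring(0, TS_LEN), TS);
                if (Duration.between(center, t).abs().compareTo(window) <= 0) {
                    hits.add(line);
                }
            } catch (DateTimeParseException e) {
                // Not a timestamped line (for example a stack frame); skip it.
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Lines taken from the JobClient output quoted in this thread.
        List<String> lines = Arrays.asList(
                "08/07/06 01:43:29 INFO mapred.JobClient:  map 100% reduce 93%",
                "java.io.IOException: Could not get block locations. Aborting...",
                "08/07/06 01:44:32 INFO mapred.JobClient:  map 100% reduce 74%",
                "08/07/06 01:44:45 INFO mapred.JobClient:  map 100% reduce 54%");
        for (String hit : linesNear(lines, "08/07/06 01:43:29", Duration.ofSeconds(30))) {
            System.out.println(hit);
        }
    }
}
```

Untimestamped lines such as stack frames are dropped in this sketch; keeping the frames that follow a matching line would be a natural extension.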

Raghu.

C G wrote:
> Hi All:
>  
> I've got 0.17.0 set up on a 7-node grid (6 slaves w/datanodes, 1 master running namenode).
> I'm trying to process a small (180G) dataset.  I've done this successfully and painlessly
> running 0.15.0.  When I run 0.17.0 with the same data and same code (w/API changes for 0.17.0
> and recompiled, of course), I get a ton of failures.  I've increased the number of namenode
> threads trying to resolve this, but that doesn't seem to help.  The errors are of the following
> flavor:
>  
> java.io.IOException: Could not get block locations. Aborting...
> java.io.IOException: All datanodes 10.2.11.2:50010 are bad. Aborting...
> Exception in thread "Thread-2" java.util.ConcurrentModificationException
> Exception closing file /blah/_temporary/_task_200807052311_0001_r_000004_0/baz/part-xxxxx
>  
> As things stand right now, I can't deploy to 0.17.0 (or 0.16.4 or 0.17.1).  I am wondering
> if anybody can shed some light on this, or if others are having similar problems.
>  
> Any thoughts, insights, etc. would be greatly appreciated.
>  
> Thanks,
> C G
>  
> Here's an ugly trace:
> 08/07/06 01:43:29 INFO mapred.JobClient:  map 100% reduce 93%
> 08/07/06 01:43:29 INFO mapred.JobClient: Task Id : task_200807052311_0001_r_000003_0, Status : FAILED
> java.io.IOException: Could not get block locations. Aborting...
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2080)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
> task_200807052311_0001_r_000003_0: Exception closing file /output/_temporary/_task_200807052311_0001_r_000003_0/a/b/part-00003
> task_200807052311_0001_r_000003_0: java.io.IOException: All datanodes 10.2.11.2:50010 are bad. Aborting...
> task_200807052311_0001_r_000003_0:      at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2095)
> task_200807052311_0001_r_000003_0:      at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
> task_200807052311_0001_r_000003_0:      at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
> task_200807052311_0001_r_000003_0: Exception in thread "Thread-2" java.util.ConcurrentModificationException
> task_200807052311_0001_r_000003_0:      at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100)
> task_200807052311_0001_r_000003_0:      at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154)
> task_200807052311_0001_r_000003_0:      at org.apache.hadoop.dfs.DFSClient.close(DFSClient.java:217)
> task_200807052311_0001_r_000003_0:      at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:214)
> task_200807052311_0001_r_000003_0:      at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1324)
> task_200807052311_0001_r_000003_0:      at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:224)
> task_200807052311_0001_r_000003_0:      at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:209)
> 08/07/06 01:44:32 INFO mapred.JobClient:  map 100% reduce 74%
> 08/07/06 01:44:32 INFO mapred.JobClient: Task Id : task_200807052311_0001_r_000001_0, Status : FAILED
> java.io.IOException: Could not get block locations. Aborting...
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2080)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
> task_200807052311_0001_r_000001_0: Exception in thread "Thread-2" java.util.ConcurrentModificationException
> task_200807052311_0001_r_000001_0:      at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100)
> task_200807052311_0001_r_000001_0:      at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154)
> task_200807052311_0001_r_000001_0:      at org.apache.hadoop.dfs.DFSClient.close(DFSClient.java:217)
> task_200807052311_0001_r_000001_0:      at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:214)
> task_200807052311_0001_r_000001_0:      at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1324)
> task_200807052311_0001_r_000001_0:      at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:224)
> task_200807052311_0001_r_000001_0:      at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:209)
> 08/07/06 01:44:45 INFO mapred.JobClient:  map 100% reduce 54%

