hadoop-common-user mailing list archives

From C G <parallel...@yahoo.com>
Subject Issues with 0.14.0...
Date Thu, 23 Aug 2007 20:05:44 GMT
Hi All:
   
  I tried 0.14.0 today with limited success.  0.13.0 was doing pretty well, but I'm not able
to get as far with 0.14.0.
   
  My environment is a single-node, 4-way box with 8G memory and 500G disk space.
   
  First up is an out-of-memory error.  The dataset is 1,000,000 rows (but only 60M in size),
and I'm running 4 aggregations (using a plugin which extends ValueAggregatorBaseDescriptor
and does one UNIQ_VALUE_COUNT and 3 LONG_VALUE_SUM aggregators).  This previously worked on
0.13.0:
   
java.lang.OutOfMemoryError: Java heap space
        at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:95)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.io.Text.write(Text.java:243)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:338)
        at org.apache.hadoop.mapred.lib.aggregate.ValueAggregatorMapper.map(ValueAggregatorMapper.java:48)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:189)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1778)
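
  For context, the shape of the aggregation described above (one unique-value count plus
three long sums) can be sketched in plain Java.  This is only an illustrative, in-memory
sketch and not the actual plugin: it does not use the Hadoop API, and the class name
AggregationSketch, the aggregate method, and the {id, a, b, c} row layout are all
hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the plugin's logic; no Hadoop classes are used.
public class AggregationSketch {

    /** Holds one unique-value count and three running sums. */
    public static final class Result {
        public final int uniqueCount;
        public final long[] sums;

        Result(int uniqueCount, long[] sums) {
            this.uniqueCount = uniqueCount;
            this.sums = sums;
        }
    }

    /**
     * Each row is assumed to be {id, a, b, c}: the id feeds the
     * unique-value count, and a, b, c feed the three long sums.
     */
    public static Result aggregate(List<String[]> rows) {
        Set<String> seen = new HashSet<>();   // distinct ids seen so far
        long[] sums = new long[3];            // one running total per numeric column
        for (String[] row : rows) {
            seen.add(row[0]);
            for (int i = 0; i < 3; i++) {
                sums[i] += Long.parseLong(row[i + 1]);
            }
        }
        return new Result(seen.size(), sums);
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"u1", "1", "2", "3"},
            new String[] {"u1", "4", "5", "6"},
            new String[] {"u2", "7", "8", "9"});
        Result r = aggregate(rows);
        System.out.println("unique=" + r.uniqueCount + " sums=" + Arrays.toString(r.sums));
    }
}
```

  In the real job the map side emits one key/value pair per aggregator per row, so four
aggregators over 1,000,000 rows means roughly 4,000,000 map output records passing
through the collect call shown in the trace.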
   
  The second issue is a failure on copyFromLocal with lost connections.  I'm trying to copy a
5.8G, 88,784,045-row file to HDFS.  It makes progress for a while, but at approximately
2.1G copied, it dies with a repeated series of errors.  There is 470G free on the file
system.  The error is repeated several times and is:
  $ bin/hadoop dfs -copyFromLocal sample.dat /input/sample.dat
07/08/23 15:58:10 WARN fs.DFSClient: Error while writing.
java.net.SocketException: Connection reset
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.endBlock(DFSClient.java:1656)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:1610)
        at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:140)
        at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:100)
        at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:39)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.fs.FileUtil.copyContent(FileUtil.java:258)
        at org.apache.hadoop.fs.FileUtil.copyContent(FileUtil.java:248)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:133)
        at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:776)
        at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:757)
        at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:116)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:1229)
        at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:1342)
   
  The following error also appears several times in the datanode logs:
  2007-08-23 15:58:10,072 ERROR org.apache.hadoop.dfs.DataNode: DataXceiver: java.io.IOException:
Unexpected checksum mismatch while writing blk_1461965301876815406 from /xxx.xxx.xxx.xx:50960
        at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:902)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:727)
        at java.lang.Thread.run(Thread.java:595)

   
  Any help on these issues would be much appreciated.

       