hadoop-common-dev mailing list archives

From "Bwolen Yang" <wbwo...@gmail.com>
Subject Re: \r\n problem in LineRecordReader.java
Date Wed, 13 Jun 2007 15:43:24 GMT
> taking values at runtime (i have it thru exceptions when the result is
> 0 and print out he values).

The \r\n problem was observed on the 0.13.0 release.
To study the behavior, I instrumented the Hadoop source from the head of the tree.

More specifically, attached are two sample stacks.  (I have readBuffer
throw when it gets 0 bytes, and have inputchecker catch the
exception and rethrow both.  This way, I capture the values from both
caller and callee.)
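
For anyone who wants to reproduce this kind of instrumentation, here is a rough sketch of the idea, not the actual patch. ZeroReadProbe, calleeRead and callerRead are made-up names standing in for the callee and caller layers; the only real API used is java.io.InputStream.read(byte[], int, int).

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for the two instrumented layers; not Hadoop code.
class ZeroReadProbe {

    // Callee: throw instead of returning when the underlying read yields 0 bytes,
    // recording the values visible at this level.
    static int calleeRead(InputStream datas, byte[] b, int off, int len) throws IOException {
        int read = datas.read(b, off, len);
        if (read == 0) {
            throw new RuntimeException("end of read() datas=" + datas.getClass().getName()
                    + " len=" + len + " read=" + read);
        }
        return read;
    }

    // Caller: catch the callee's exception and rethrow with the caller's own values,
    // keeping the original as the cause so both sets of values show up in one trace.
    static int callerRead(InputStream in, byte[] b, int off, int len, long pos) throws IOException {
        try {
            return calleeRead(in, b, off, len);
        } catch (RuntimeException e) {
            throw new RuntimeException("end of read() in=" + in.getClass().getName()
                    + " len=" + len + " pos=" + pos, e);
        }
    }
}
```

Chaining the callee's exception as the cause is what produces a "Caused by:" section under the caller's frames.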

On a separate note, if the assumption (len >= bytesPerSum) exists, would
it be OK to throw an exception when it is violated?  Most of the time
(e.g., in crawl/indexing), people won't notice that some part of the
input data is getting thrown away.  It would be a lot easier to debug as
the code changes (and assumptions get violated), and the cost in this
case is probably not too bad, as a good part of the cost is probably in
the network and going to disk.
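
To make the proposal concrete, a minimal sketch of such a check could look like the following. ChunkPrecondition and checkLen are hypothetical names, and IllegalArgumentException is just one possible choice of exception; none of this is taken from the Hadoop source.

```java
// Hypothetical fail-fast guard; not part of Hadoop.
class ChunkPrecondition {

    // Throws when the requested length is smaller than one checksum chunk,
    // so a violated assumption surfaces as an error instead of dropped input.
    static void checkLen(int len, int bytesPerSum) {
        if (len < bytesPerSum) {
            throw new IllegalArgumentException(
                    "len=" + len + " is smaller than bytesPerSum=" + bytesPerSum);
        }
    }
}
```

The check is a single integer comparison per read call, which should be small next to the network and disk cost mentioned above.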

bwolen
-------------------------------------

java.lang.RuntimeException: end of read()
in=org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker len=127
pos=45223932 res=-999999
	at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:50)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
	at org.apache.hadoop.fs.FSDataInputStream$Buffer.read(FSDataInputStream.java:116)
	at java.io.FilterInputStream.read(FilterInputStream.java:66)
	at org.apache.hadoop.mapred.LineRecordReader.readLine(LineRecordReader.java:132)
	at org.apache.hadoop.mapred.LineRecordReader.readLine(LineRecordReader.java:124)
	at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:108)
	at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:168)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:44)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:186)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1720)
Caused by: java.lang.RuntimeException: end of read()
datas=org.apache.hadoop.dfs.DFSClient$DFSDataInputStream pos=45223932
len=-381 bytesPerSum=512 eof=false read=0
	at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.readBuffer(ChecksumFileSystem.java:200)
	at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.read(ChecksumFileSystem.java:175)
	at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:47)
	... 11 more

-------------------------
java.lang.RuntimeException: end of read()
in=org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker len=127
pos=45223932 res=-999999
	at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:50)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
	at org.apache.hadoop.fs.FSDataInputStream$Buffer.read(FSDataInputStream.java:116)
	at java.io.FilterInputStream.read(FilterInputStream.java:66)
	at org.apache.hadoop.mapred.LineRecordReader.readLine(LineRecordReader.java:132)
	at org.apache.hadoop.mapred.LineRecordReader.readLine(LineRecordReader.java:124)
	at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:108)
	at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:168)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:44)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:186)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1720)
Caused by: java.lang.RuntimeException: end of read()
datas=org.apache.hadoop.dfs.DFSClient$DFSDataInputStream pos=45223932
len=-381 bytesPerSum=512 eof=false read=0
	at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.readBuffer(ChecksumFileSystem.java:200)
	at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.read(ChecksumFileSystem.java:175)
	at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:47)
	... 11 more
