hadoop-hdfs-user mailing list archives

From java8964 <java8...@hotmail.com>
Subject RE: Question about the usage of Seekable within the LineRecordReader
Date Wed, 19 Feb 2014 19:55:51 GMT
Hi, Brian:
I hope I understand your question correctly. Here is my view of what the Seekable interface
provides.
The Seekable interface also defines the "seek(long pos)" method, which allows the client to
seek to a specified position in the underlying InputStream.
The RecordReader is given the start position and an instance of the InputSplit, but
the underlying input stream is not open or available yet.
The RecordReader will find the correct start position of the stream and use the Seekable
interface to "seek" to that position, then start reading bytes from there and translate the
bytes that follow into <K, V> pairs.
Without the Seekable interface, there is no way to "seek" to the correct starting position.
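
As a rough sketch of the idea (this is not the actual Hadoop code), the example below seeks to a split's start offset before reading. SimpleSeekable, ByteArraySeekableStream and readLineFrom are hypothetical names made up for illustration; SimpleSeekable only mirrors the seek(long) and getPos() methods discussed here, and unlike the real Hadoop methods these do not throw IOException.

```java
import java.nio.charset.StandardCharsets;

public class SeekSketch {
    /** Hypothetical stand-in for the two Seekable methods discussed in this thread. */
    interface SimpleSeekable {
        void seek(long pos);
        long getPos();
    }

    /** In-memory stream that supports seeking; for illustration only. */
    static class ByteArraySeekableStream implements SimpleSeekable {
        private final byte[] data;
        private int pos = 0;

        ByteArraySeekableStream(byte[] data) {
            this.data = data;
        }

        @Override
        public void seek(long newPos) {
            if (newPos < 0 || newPos > data.length) {
                throw new IllegalArgumentException("bad position: " + newPos);
            }
            pos = (int) newPos;
        }

        @Override
        public long getPos() {
            return pos;
        }

        /** Returns the next byte, or -1 at end of data. */
        int read() {
            return pos < data.length ? (data[pos++] & 0xff) : -1;
        }
    }

    /** Seeks to the given start offset, then reads up to the next newline or end of data. */
    static String readLineFrom(ByteArraySeekableStream in, long start) {
        in.seek(start);
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            sb.append((char) b);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] file = "first\nsecond\nthird\n".getBytes(StandardCharsets.US_ASCII);
        ByteArraySeekableStream in = new ByteArraySeekableStream(file);
        // Pretend the split starts at byte offset 6, the first byte after "first\n".
        String line = readLineFrom(in, 6);
        System.out.println(line + " / pos=" + in.getPos());
    }
}
```

Without the seek call, the reader would have to start at byte 0 and read through everything before the split's start offset.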

Date: Wed, 19 Feb 2014 14:39:00 -0500
Subject: Question about the usage of Seekable within the LineRecordReader
From: bstempin@rightaction.com
To: user@hadoop.apache.org

Hi List,
In order to write my own record reader, I'm taking a look at the LineRecordReader
in v 2.2.0.  I notice that it uses Seekable in order to tell where it is in the file when
using something other than an InputStream.  As far as I can see, the only reason it's used
is to get the current position within the file (within getFilePosition()).

My question is:  Why?  It looks like the file position is already tracked by the pos field.
 Is there a reason to use Seekable.getPos() instead of looking at pos?

Thanks for the help,
Brian