hadoop-common-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Fileformat query
Date Thu, 28 Jan 2010 15:35:53 GMT
On Thu, Jan 28, 2010 at 4:01 AM, Udaya Lakshmi <udaya603@gmail.com> wrote:
> Hi all..
>   I have searched the documentation but could not find a input file
> format which will give line number as the key and line as the value.
> Did I miss something? Can someone give me a clue of how to implement
> one such input file format.
> Thanks,
> Udaya.


When using the standard file input format (TextInputFormat), the map method looks like this:

public void map(LongWritable key, Text value,
                OutputCollector<Text, IntWritable> output,
                Reporter reporter) throws IOException {

The key is the byte offset of the line within the input file, not a
line number. There is no easy way to translate a byte offset into a
logical line number unless all lines are fixed width, which is not
usually the case.
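
To make the difference concrete, here is a minimal sketch in plain Java (no Hadoop classes) of the two cases. The class and method names (OffsetToLine, fixedWidthLineNumber, lineStartOffsets) are made up for illustration and are not part of any Hadoop API. The fixed-width case assumes every record occupies exactly the same number of bytes, newline included.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not a Hadoop class.
class OffsetToLine {

    // Fixed-width case: the zero-based line number is a simple division.
    // recordBytes is the full record length in bytes, including the newline.
    static long fixedWidthLineNumber(long byteOffset, int recordBytes) {
        if (recordBytes <= 0) {
            throw new IllegalArgumentException("recordBytes must be positive");
        }
        if (byteOffset < 0 || byteOffset % recordBytes != 0) {
            throw new IllegalArgumentException("offset is not a record boundary");
        }
        return byteOffset / recordBytes;
    }

    // Variable-width case: the only way to know where each line starts is to
    // scan every byte that precedes it and note each position after a '\n'.
    static List<Long> lineStartOffsets(byte[] data) {
        List<Long> starts = new ArrayList<>();
        if (data.length > 0) {
            starts.add(0L);
        }
        // A newline in the last position does not start another line.
        for (int i = 0; i < data.length - 1; i++) {
            if (data[i] == '\n') {
                starts.add((long) (i + 1));
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        // Fixed width: offset 30 with 10-byte records is zero-based line 3.
        System.out.println(fixedWidthLineNumber(30L, 10));

        // Variable width: the position in the list is the zero-based line number.
        byte[] data = "ab\ncde\nf\n".getBytes(StandardCharsets.UTF_8);
        List<Long> starts = lineStartOffsets(data);
        System.out.println(starts);
        System.out.println(starts.indexOf(7L));
    }
}
```

The variable-width scan is what makes this awkward inside a map task: a mapper normally sees only its own split, so it cannot know how many lines came before its first byte offset without reading the preceding data as well.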

