hadoop-common-user mailing list archives

From ishwar ramani <rvmish...@gmail.com>
Subject Re: retrieving SequenceFile position of key in mapper task
Date Mon, 12 Oct 2009 17:23:17 GMT
Thanks, that worked fine.


On Thu, Oct 8, 2009 at 10:45 PM, Ahad Rana <ahad@commoncrawl.org> wrote:
> Oops, memory fails me. To correct my previous statement, for block
> compressed files, getPosition reflects the position in the input stream of
> the NEXT compressed block of data, so you have to watch for the change in
> position after reading the key/value to capture a block transition.
> Ahad.
>
> On Thu, Oct 8, 2009 at 10:22 PM, Ahad Rana <ahad@commoncrawl.org> wrote:
>
>> Hi Ishwar,
>> You can implement a custom MapRunner and retrieve the position from the
>> reader before calling your map function. Be aware, though, that for
>> block-compressed files the position returned is the block start position,
>> not the individual record position.
>>
>> Ahad.
>>
>>
>> On Thu, Oct 8, 2009 at 4:23 PM, ishwar ramani <rvmishwar@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I need to get the position of the key being processed in a mapper task.
>>> My input file is a sequence file.
>>>
>>> I tried the Context, but the best I could get was the input split
>>> position and the file name.
>>>
>>> My other option is to record the position in the key or value while
>>> generating the sequence file, but that would mean rewriting all the
>>> files I already have :(
>>>
>>> Any thoughts?
>>>
>>> ishwar
>>>
>>
>>
>
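
The block-transition rule described in the thread can be sketched as follows. This is a minimal, self-contained illustration and not Hadoop code: `PositionedReader`, `SimulatedReader`, `Located`, and `locate` are hypothetical names invented here, and the byte offsets are made up. In a real job, the position would presumably come from the record reader handed to a custom MapRunner (for example `RecordReader.getPos()` in the old `mapred` API, or `SequenceFile.Reader.getPosition()` when reading the file directly). The sketch assumes the behaviour Ahad describes: the reported position jumps to the start of the NEXT block when the first record of a block is read, and stays put for the remaining records of that block.

```java
import java.util.ArrayList;
import java.util.List;

class BlockPositionTracker {

    /** Hypothetical stand-in for a reader that exposes its stream position. */
    interface PositionedReader {
        long getPosition();   // for block-compressed input: start of the NEXT block
        boolean next();       // advance to the next record; false at end of input
        String currentKey();  // key of the record just read
    }

    /** A key tagged with the start of its block and its index inside that block. */
    static final class Located {
        final String key;
        final long blockStart;
        final int indexInBlock;

        Located(String key, long blockStart, int indexInBlock) {
            this.key = key;
            this.blockStart = blockStart;
            this.indexInBlock = indexInBlock;
        }
    }

    /** Sample the position before and after each read; a change marks a new block. */
    static List<Located> locate(PositionedReader reader) {
        List<Located> out = new ArrayList<Located>();
        long before = reader.getPosition();
        long blockStart = before;
        int index = -1;
        while (reader.next()) {
            long after = reader.getPosition();
            if (after != before) {
                // The read crossed into a new block, which began at `before`.
                blockStart = before;
                index = 0;
            } else {
                index++;
            }
            out.add(new Located(reader.currentKey(), blockStart, index));
            before = after;
        }
        return out;
    }

    /** Simulated block-compressed input (assumes every block is non-empty). */
    static final class SimulatedReader implements PositionedReader {
        private final String[][] blocks;
        private final long[] starts; // starts[b] = start of block b; last entry = end of data
        private int block = -1;
        private int idx = 0;
        private long pos;
        private String key;

        SimulatedReader(String[][] blocks, long[] starts) {
            this.blocks = blocks;
            this.starts = starts;
            this.pos = starts[0];
        }

        public long getPosition() { return pos; }

        public boolean next() {
            if (block < 0 || idx >= blocks[block].length) {
                if (block + 1 >= blocks.length) return false;
                block++;
                idx = 0;
                pos = starts[block + 1]; // loading a block moves the stream past it
            }
            key = blocks[block][idx++];
            return true;
        }

        public String currentKey() { return key; }
    }

    public static void main(String[] args) {
        PositionedReader reader = new SimulatedReader(
                new String[][] {{"a", "b"}, {"c"}}, new long[] {100L, 250L, 300L});
        for (Located l : locate(reader)) {
            System.out.println(l.key + " block=" + l.blockStart + " index=" + l.indexInBlock);
        }
    }
}
```

The pair (block start, index within block) is enough to find a record again later: seek to the block start and skip forward by the index. For input where the position advances on every read, the same loop degenerates to one record per "block", so each key is simply tagged with its own start position.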
