hadoop-common-user mailing list archives

From Mark question <markq2...@gmail.com>
Subject Re: SequenceFile.Reader
Date Fri, 03 Jun 2011 05:32:45 GMT
Actually, I checked the source code of Reader, and it turns out that it reads the
value into a buffer but returns only the key to the user :(  How is this
different from:

Writable value = new Text();  // any concrete Writable; Writable itself is an interface

reader.next(key, value);

Both reuse the same object across multiple reads. I was hoping next(key)
would skip reading the value from disk.
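
For illustration, a key-only read that skips by record length could look
like the sketch below. This is a toy, made-up layout
([keyLen][valueLen][key][value]) held in an in-memory buffer, not the real
SequenceFile format or the Reader implementation; the class and method
names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy format, NOT the real SequenceFile layout:
// each record is [int keyLen][int valueLen][key bytes][value bytes].
public class SkipValueSketch {

    // Serialize {key, value} pairs into the toy layout.
    public static byte[] write(String[][] records) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String[] rec : records) {
            byte[] k = rec[0].getBytes(StandardCharsets.UTF_8);
            byte[] v = rec[1].getBytes(StandardCharsets.UTF_8);
            byte[] header = ByteBuffer.allocate(8).putInt(k.length).putInt(v.length).array();
            out.write(header, 0, header.length);
            out.write(k, 0, k.length);
            out.write(v, 0, v.length);
        }
        return out.toByteArray();
    }

    // Return only the keys; each value is jumped over using its recorded length.
    public static List<String> readKeysOnly(byte[] data) {
        List<String> keys = new ArrayList<>();
        ByteBuffer in = ByteBuffer.wrap(data);
        while (in.hasRemaining()) {
            int keyLen = in.getInt();
            int valueLen = in.getInt();
            byte[] k = new byte[keyLen];
            in.get(k);
            keys.add(new String(k, StandardCharsets.UTF_8));
            // Advance past the value without copying its bytes out.
            in.position(in.position() + valueLen);
        }
        return keys;
    }

    public static void main(String[] args) {
        byte[] data = write(new String[][] {{"k1", "value-one"}, {"k2", "value-two"}});
        System.out.println(readKeysOnly(data));
    }
}
```

Even with a layout like this, whether a skip really avoids disk I/O depends
on the underlying stream, since buffered or block-based readers may fetch
the value bytes anyway.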

Mark

On Thu, Jun 2, 2011 at 6:20 PM, Mark question <markq2011@gmail.com> wrote:

> Hi John, thanks for the reply. But I'm not asking about the key memory
> allocation here. I'm just asking what the difference is between:
>
> next(key, value) and next(key). Is the latter still reading the value
> of the key to reach the next key, or does it read the key and then use the
> recordSize to skip to the next key?
>
> Thanks,
> Mark
>
>
>
>
> On Thu, Jun 2, 2011 at 3:49 PM, John Armstrong <john.armstrong@ccri.com> wrote:
>
>> On Thu, 2 Jun 2011 15:43:37 -0700, Mark question <markq2011@gmail.com>
>> wrote:
>> > Does anyone know whether SequenceFile.Reader.next(key) actually avoids
>> > reading the value into memory?
>>
>> I think what you're confused by is something I stumbled upon quite by
>> accident.  The secret is that there is actually only ONE Key object that
>> the RecordReader presents to you.  The next() method doesn't create a new
>> Key object (containing the new data) but actually just loads the new data
>> into the existing Key object.
>>
>> The only place I've seen that you absolutely must remember these unusual
>> semantics is when you're trying to copy keys or values for some reason, or
>> to iterate over the Iterable of values more than once.  In these cases you
>> must make defensive copies because otherwise you'll just get a big list of
>> copies of the same Key, containing the last Key data you saw.
>>
>> hth
>>
>
>

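The object-reuse pitfall John describes can be reproduced without Hadoop.
The sketch below is only an illustration: MutableKey and ToyReader are
hypothetical stand-ins for a reusable key object and a reader whose next()
refills it, not Hadoop classes.

```java
import java.util.ArrayList;
import java.util.List;

public class ReuseSketch {

    // Hypothetical mutable key, standing in for a reusable key object.
    public static final class MutableKey {
        private String data = "";
        public void set(String s) { data = s; }
        public String get() { return data; }
        public MutableKey copy() {
            MutableKey c = new MutableKey();
            c.set(data);
            return c;
        }
    }

    // Hypothetical reader: next() refills the caller's key instead of allocating a new one.
    public static final class ToyReader {
        private final String[] source;
        private int pos = 0;
        public ToyReader(String... source) { this.source = source; }
        public boolean next(MutableKey key) {
            if (pos >= source.length) return false;
            key.set(source[pos++]);
            return true;
        }
    }

    // Collect keys, either storing the reused object itself or a defensive copy.
    public static List<String> collect(boolean defensiveCopy, String... input) {
        ToyReader reader = new ToyReader(input);
        MutableKey key = new MutableKey();
        List<MutableKey> seen = new ArrayList<>();
        while (reader.next(key)) {
            seen.add(defensiveCopy ? key.copy() : key);
        }
        List<String> out = new ArrayList<>();
        for (MutableKey k : seen) {
            out.add(k.get());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collect(false, "a", "b", "c")); // prints [c, c, c]
        System.out.println(collect(true, "a", "b", "c"));  // prints [a, b, c]
    }
}
```

Without the copy, every list entry aliases the one key object and so shows
the last data read. With Hadoop's Text, the analogous defensive copy would
be something like new Text(key).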