hadoop-common-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Douglas <chri...@yahoo-inc.com>
Subject Re: aerialization.Deserializer.deserialize method help
Date Fri, 12 Sep 2008 22:01:23 GMT
Oh, I see what you mean. Yes, you need to reuse the objects that  
you're given in your deserializer.

This will change with HADOOP-1230, though. -C

On Sep 12, 2008, at 2:28 PM, Pete Wyckoff wrote:

>
> What I mean is let's say I plug in a deserializer that always  
> returns a new
> Object - in that case, since everything is pass by value, the new  
> object
> cannot make its way back to the SequenceFileRecordReader user.
>
> While(sequenceFileRecordReader.next(mykey, myvalue)) {
>  // do something
> }
>
> And then my deserializers one/both looks like:
>
> T deserialize(T obj) {
> // ignore obj
>  return new T(params);
> }
>
> Obj would be the key or the value passed in by the user, but since I  
> ignore
> it, basically what happens is the deserialized value actually gets  
> thrown
> away.
>
> More specifically, it gets thrown away in SequenceFile.Reader I  
> believe.
>
> -- pete
>
>
> On 9/12/08 2:20 PM, "Chris Douglas" <chrisdo@yahoo-inc.com> wrote:
>
>> If you pass in null to the deserializer, it creates a new instance  
>> and
>> returns it; passing in an instance reuses it.
>>
>> I don't understand the disconnect between Deserializer and the
>> RecordReader. Does your RecordReader generate instances that only
>> share a common subtype T? You need separate Deserializers for K and  
>> V,
>> if that's the issue... -C
>>
>> On Sep 12, 2008, at 2:01 PM, Pete Wyckoff wrote:
>>
>>>
>>> This method's signature is
>>> {code}
>>> T deserialize(T);
>>> {code}
>>>
>>> But, the RecordReader next method is
>>>
>>> {code}
>>> boolean next(K,V);
>>> {code}
>>>
>>> So, if the deserialize method does not return the same T (i.e., K or
>>> V), how
>>> would this new Object be propagated back thru the RecordReader next
>>> method.
>>>
>>> It seems the contract on the deserialize method is that it must
>>> return the
>>> same  T (although the javadocs say "may").
>>>
>>> Am I missing something? And if not, why isn't the API boolean
>>> deserialize(T)
>>> ?
>>>
>>> Thanks, pete
>>>
>>> Ps for things like Thrift, there's no way to re-use the object as
>>> there's no
>>> clear method, so if this is the case, I don't see how it would  
>>> work??
>>>
>>
>


Mime
View raw message