hadoop-common-user mailing list archives

From Tim Robertson <timrobertson...@gmail.com>
Subject Re: Object Serialization
Date Mon, 30 Nov 2009 12:44:09 GMT
I've been trying to think why it might work the second time and not the first.
I have a hunch that you are not handling your output streams carefully;
specifically, I wonder whether you are flushing and closing them.
If you are not closing them yourself, they will close in
garbage collection, and only flush then.  This might be why you
observe weird results.

Please try adding a call to flush() on any output streams before
returning, and of course, don't forget to close them in a

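To illustrate the flush-and-close advice above, here is a minimal sketch in
Java; it is not the original poster's code. It assumes Java 8 or later, and
uses java.util.Base64 as a stand-in for the unspecified `encode` helper in the
quoted snippet. The class and method names (SerializationSketch, toBase64,
fromBase64) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;

public class SerializationSketch {

    // Serialize an object to a Base64 string (hypothetical helper).
    // The ObjectOutputStream is flushed and closed before the bytes are read.
    public static String toBase64(Serializable obj) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(obj);
            oos.flush(); // push any buffered bytes into baos
        } // try-with-resources closes oos here, even if writeObject throws
        return Base64.getEncoder().encodeToString(baos.toByteArray());
    }

    // Decode the Base64 string and deserialize the object (hypothetical helper).
    public static Object fromBase64(String encoded)
            throws IOException, ClassNotFoundException {
        byte[] bytes = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Illustrative data only; the original list contents are not shown.
        ArrayList<String> list =
            new ArrayList<>(Arrays.asList("AAPL", "GOOG", "MSFT"));
        String encoded = toBase64(list);
        Object restored = fromBase64(encoded);
        System.out.println(list.equals(restored)); // prints true
    }
}
```

The point of this arrangement is that baos.toByteArray() is only called after
the ObjectOutputStream has been closed, so nothing can be left behind in its
internal buffer.
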
Just a hunch,


On Mon, Nov 30, 2009 at 1:30 PM, Edmund Kohlwey <ekohlwey@gmail.com> wrote:
> There should be no need to encode the object (I imagine you're passing it
> through the conf). Try (as Tim hints at below) saving the object to a flat
> file and using DistributedCache; pass the path to the serialized object file
> through the configuration instead.
> Although you "shouldn't" put small files in HDFS, for situations like this
> where you're distributing serialized objects, data sets, etc. it's
> acceptable. Just clean them up once you no longer need them.
> On 11/30/09 3:04 AM, Tim Robertson wrote:
>> How do you pass it to every map function?
>> Are you putting it in a DistributedCache and pulling it during the Map
>> configuration?
>> Are you running on a single machine?
>> Cheers
>> Tim
>> On Sun, Nov 29, 2009 at 11:10 PM, <aa225@buffalo.edu> wrote:
>>> Hello Everybody,
>>> I have a question about object serialization in Hadoop. I have an object A
>>> which I want to pass to every map function. The code I am currently using
>>> for this is below. The problem is that when I run my program, it crashes the
>>> first time with an error saying that Java cannot deserialize the object list
>>> (but there is no error when Java serializes it), and then when I run the
>>> program a second time, without changing anything, the code works perfectly.
>>> I read in a blog post that the method I have used to serialize is not the
>>> ideal way, but that does not explain the weird results I am getting.
>>>     try
>>>     {
>>>         ByteArrayOutputStream baos = new ByteArrayOutputStream();
>>>         ObjectOutputStream oos = new ObjectOutputStream(baos);
>>>         oos.writeObject(list);
>>>         stock_list = encode.encode(baos.toByteArray());
>>>     }
>>>     catch (IOException e)
>>>     {
>>>         e.printStackTrace();
>>>     }
>>> Thank You
>>> Abhishek Agrawal
>>> SUNY- Buffalo
>>> (716-435-7122)
