hadoop-common-user mailing list archives

From Jason Venner <ja...@attributor.com>
Subject Re: JobConf: How to pass List/Map
Date Thu, 01 May 2008 20:46:42 GMT
We have been serializing to a ByteArrayOutputStream, then base64-encoding
the underlying byte array and passing that string in the conf.
It is ugly, but it works well until 0.17 arrives.
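
A minimal sketch of that approach, assuming plain Java serialization and
java.util.Base64 (Java 8+; any base64 codec would do in its place). The
class name ConfCodec and the key "my.terms" are hypothetical, and a plain
Map stands in for JobConf here. In a real job you would call
conf.set(key, encoded) in the driver and conf.get(key) in the task.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Hadoop.
class ConfCodec {

    // Serialize the object into a byte array, then base64-encode those bytes
    // so the result is a plain string that is safe to store in a conf value.
    static String encode(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed (and flushed) before the bytes are read.
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Reverse of encode: base64-decode the string, then deserialize the object.
    static Object decode(String encoded) {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for JobConf (assumption): only string set/get is needed.
        Map<String, String> conf = new HashMap<>();

        // Driver side: encode the list and store it under a key.
        ArrayList<String> terms = new ArrayList<>(Arrays.asList("alpha", "beta"));
        conf.put("my.terms", encode(terms));

        // Task side: read the string back and restore the list.
        Object restored = decode(conf.get("my.terms"));
        System.out.println(restored); // prints [alpha, beta]
    }
}
```

The same two methods work for a HashMap or any other Serializable value.
The encoded string grows with the object, so this only suits small values.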

Enis Soztutar wrote:
> Yes, Stringifier was committed in 0.17. What you can do in 0.16 is to 
> simulate DefaultStringifier. The key feature of the Stringifier is 
> that it can convert/restore any object to string using base64 encoding 
> on the binary form of the object. If your objects can be easily 
> converted to and from strings, then you can directly store them in 
> conf. The other obvious alternative would be to switch to 0.17, once 
> it is out.
>
> Tarandeep Singh wrote:
>> On Wed, Apr 30, 2008 at 5:11 AM, Enis Soztutar 
>> <enis.soz.nutch@gmail.com> wrote:
>>  
>>> Hi,
>>>
>>>  There are many ways to pass objects through the configuration.
>>> Possibly the easiest is to use the Stringifier interface.
>>>
>>>  For example:
>>>
>>>  DefaultStringifier.store(conf, variable, "mykey");
>>>
>>>  variable = DefaultStringifier.load(conf, "mykey", variableClass);
>>>     
>>
>> Thanks... but I am using Hadoop 0.16, and Stringifier only arrives
>> in version 0.17:
>> https://issues.apache.org/jira/browse/HADOOP-3048
>>
>> Any thoughts on how to do this in version 0.16?
>>
>> thanks,
>> Taran
>>
>>  
>>>  You should take into account that the variable you pass to the
>>> configuration must be serializable by the framework. That means it
>>> must implement the Writable or Serializable interface. In your
>>> particular case, you might want to look at the ArrayWritable and
>>> MapWritable classes.
>>>
>>>  That said, you should not pass large objects via the configuration,
>>> since it can seriously affect job overhead. If the data you want to
>>> pass is large, then you should use other alternatives (such as
>>> DistributedCache, HDFS, etc.).
>>>
>>>
>>>
>>>  Tarandeep Singh wrote:
>>>
>>>    
>>>> Hi,
>>>>
>>>> How can I set a list or map on JobConf so that I can access it in
>>>> the Mapper/Reducer class?
>>>> The get/setObject method from Configuration has been deprecated and
>>>> the documentation says -
>>>> "A side map of Configuration to Object should be used instead."
>>>> I could not follow this :(
>>>>
>>>> Can someone please explain to me how to do this ?
>>>>
>>>> Thanks,
>>>> Taran
>>>>
>>>>
>>>>
>>>>       
>>
>>   
>
-- 
Jason Venner
Attributor - Program the Web <http://www.attributor.com/>
Attributor is hiring Hadoop Wranglers and coding wizards, contact if 
interested
