hadoop-mapreduce-user mailing list archives

From Dmitriy Lyubimov <dlie...@gmail.com>
Subject Re: Best practices configuring libraries on the backend.
Date Wed, 28 Mar 2012 20:19:30 GMT
Hm. That doesn't seem to work for me (with cdh3u3).
I defined

export HADOOP_TASKTRACKER_OPTS="-Djava.library.path=/usr/...."

and it has no effect (as opposed to when I set it via a <final>
mapred.child.java.opts property on the data node).

Still puzzling.
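
To see which value actually reaches a child JVM, something like the
following might help. It is only a sketch, not verified on cdh3u3:
extract_library_path is a made-up helper (not part of Hadoop), and the
path in the example is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): print the value of every
# -Djava.library.path=... token found in a JVM command line or opts string.
extract_library_path() {
  # Word splitting on $1 is intentional: each JVM option is one token.
  # Options containing spaces are therefore not handled.
  for opt in $1; do
    case "$opt" in
      -Djava.library.path=*)
        printf '%s\n' "${opt#-Djava.library.path=}"
        ;;
    esac
  done
}

# Example with a made-up options string (the path is a placeholder):
extract_library_path "-Xmx200m -Djava.library.path=/opt/example/lib"
```

Feeding it the command line of a running child task (for example the
output of `ps -o args= -p PID`, where PID is the task JVM's process id)
should show whether the TaskTracker-side path, the client-side path, or
both ended up on the child's command line.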

On Tue, Mar 27, 2012 at 7:17 PM, George Datskos
<george.datskos@jp.fujitsu.com> wrote:
> Dmitriy,
>
> I just double-checked, and the caveat I stated earlier is incorrect.  So,
> "-Djava.library.path" set in the client's {mapred.child.java.opts} should
> just be appended to the "-Djava.library.path" that each TaskTracker has when
> creating the library path for each child (M/R) task.  So that's even better,
> I guess.
>
>
> George
>
>
>
> On 2012/03/28 11:06, George Datskos wrote:
>>
>> Dmitriy,
>>
>> To deal with different servers having various shared libraries in
>> different locations, you can simply make sure the _TaskTracker_'s
>> -Djava.library.path is set correctly on each server.  That library path
>> should be passed along to each child (M/R) task.  (in *addition* to the
>> {mapred.child.java.opts} that you specify on the client-side configuration
>> options)
>>
>> One caveat: on the client side, don't include "-Djava.library.path", or
>> that path will be passed along to all of the child tasks, overriding the
>> site-specific one you set on the TaskTracker.
>>
>>
>> George
>>
>>
>> On 2012/03/28 10:43, Dmitriy Lyubimov wrote:
>>>
>>> Hello,
>>>
>>> I have a couple of questions regarding mapreduce configurations.
>>>
>>> We install various platforms on data nodes that require a mixed set of
>>> native libraries.
>>>
>>> Part of the problem is that, in the general case, these software
>>> platforms may be installed in different locations on the backend (we try
>>> to unify them, but still). This means a site-specific
>>> -Djava.library.path setting may be required.
>>>
>>> I configured individual JVM options (mapred.child.java.opts) on each
>>> node to include a specific set of paths. However, I encountered two
>>> problems:
>>>
>>> #1: My setting doesn't go into effect unless I also declare it final
>>> on the data node. It is simply overridden by the default -Xmx200m value
>>> from the driver, EVEN when I don't set it on the driver at all (and
>>> there seems to be no way to unset it).
>>>
>>> However, using the "final" spec on the backend creates a problem if some
>>> of the numerous jobs we run still want to override the setting. The
>>> ideal behavior would be: if I don't set it in the driver, the backend
>>> value kicks in; otherwise the driver's value is used. But I did not find
>>> a way to do that for this particular setting for some reason. Could
>>> somebody clarify the best workaround? Thank you.
>>>
>>> #2: The ideal behavior would actually be to merge driver-specific and
>>> backend-specific settings. E.g. the backend may need to configure
>>> specific software package locations, while the client may sometimes wish
>>> to set the heap size etc. Is there a best practice to achieve this effect?
>>>
>>> Thank you very much in advance.
>>> -Dmitriy
>>>
>>>
>>
>>
>>
>>
>
>
