hadoop-common-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: remote job submission
Date Sat, 21 Apr 2012 06:14:56 GMT
Hi,

A JobClient facilitates validating your job configuration, shipping the
necessary files to the cluster, and notifying the JobTracker of the new
job. Afterwards, its responsibility may merely be to monitor progress
via reports from the JobTracker (MR1) or the ApplicationMaster (MR2).

A client need not concern itself with, nor even be aware of, TaskTrackers
(or NodeManagers). These are non-permanent members of a cluster and do
not carry (critical) persistent state. The scheduling of a job and its
tasks is handled by the JobTracker in MR1 (or by the MR application's
ApplicationMaster in MR2). The only thing the user running a JobClient
needs to ensure is that they have access to the NameNode (for creating
staging files: the job jar, job xml, etc.), the DataNodes (for actually
writing those files to the DFS for the JobTracker to pick up), and the
JobTracker/scheduler (for the protocol communication required to notify
the cluster of a job and that its resources are now ready to launch, and
also for monitoring progress).
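
To make the above concrete, a minimal sketch of a client-side configuration pointing at a remote MR1 cluster might look like the fragment below. The hostnames and ports are placeholders, not from this thread; the property names are the MR1-era ones:

```xml
<!-- Hypothetical client-side configuration sketch (MR1-era property names).
     Replace nn.example.com / jt.example.com with your cluster's actual hosts. -->
<configuration>
  <!-- NameNode the client stages job files (job jar, job xml) to -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://nn.example.com:8020</value>
  </property>
  <!-- JobTracker the client notifies about the new job -->
  <property>
    <name>mapred.job.tracker</name>
    <value>jt.example.com:8021</value>
  </property>
</configuration>
```

With something like this in place on the client machine, running `hadoop jar my-job.jar MyMainClass ...` locally would submit to the remote cluster, assuming the NameNode, DataNode, and JobTracker ports are all reachable from the client.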

On Sat, Apr 21, 2012 at 5:36 AM, JAX <jayunit100@gmail.com> wrote:
> RE Arindam's question on "how to submit a job remotely".
>
> Here are my follow up questions - hope this helps to guide the discussion:
>
> 1) Normally - what is the "job client"? Do you guys typically use the namenode as the client?
>
> 2) In the case where the client != name node -- how does the client know how to start up the task trackers?
>
> UCHC
>
> On Apr 20, 2012, at 11:19 AM, Amith D K <amithdk@huawei.com> wrote:
>
>> I don't know your use case, but if it is for testing and ssh across the
>> machines is not disabled, you can write a script that uses ssh to run your
>> jobs via the CLI. You can check the ssh usage.
>>
>> Or else use Oozie.
>> ________________________________________
>> From: Robert Evans [evans@yahoo-inc.com]
>> Sent: Friday, April 20, 2012 11:17 PM
>> To: common-user@hadoop.apache.org
>> Subject: Re: remote job submission
>>
>> You can use Oozie to do it.
>>
>>
>> On 4/20/12 8:45 AM, "Arindam Choudhury" <arindamchoudhury0@gmail.com> wrote:
>>
>> Sorry. But can you give me an example?
>>
>> On Fri, Apr 20, 2012 at 3:08 PM, Harsh J <harsh@cloudera.com> wrote:
>>
>>> Arindam,
>>>
>>> If your machine can access the clusters' NN/JT/DN ports, then you can
>>> simply run your job from the machine itself.
>>>
>>> On Fri, Apr 20, 2012 at 6:31 PM, Arindam Choudhury
>>> <arindamchoudhury0@gmail.com> wrote:
>>>> "If you are allowed a remote connection to the cluster's service ports,
>>>> then you can directly submit your jobs from your local CLI. Just make
>>>> sure your local configuration points to the right locations."
>>>>
>>>> Can you elaborate in details please?
>>>>
>>>> On Fri, Apr 20, 2012 at 2:20 PM, Harsh J <harsh@cloudera.com> wrote:
>>>>
>>>>> If you are allowed a remote connection to the cluster's service ports,
>>>>> then you can directly submit your jobs from your local CLI. Just make
>>>>> sure your local configuration points to the right locations.
>>>>>
>>>>> Otherwise, perhaps you can choose to use Apache Oozie (Incubating)
>>>>> (http://incubator.apache.org/oozie/). It provides a REST interface
>>>>> that launches jobs for you on the supplied clusters, but it is more
>>>>> oriented towards workflow management. Or perhaps HUE:
>>>>> https://github.com/cloudera/hue
>>>>>
>>>>> On Fri, Apr 20, 2012 at 5:37 PM, Arindam Choudhury
>>>>> <arindamchoudhury0@gmail.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Does Hadoop have any web service or other interface so I can submit
>>>>>> jobs from a remote machine?
>>>>>>
>>>>>> Thanks,
>>>>>> Arindam
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Harsh J
>>>>>
>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>>



-- 
Harsh J
