From: Harsh J <harsh@cloudera.com>
Date: Sat, 21 Apr 2012 11:44:56 +0530
Subject: Re: remote job submission
To: common-user@hadoop.apache.org

Hi,

A JobClient facilitates validating your job configuration, shipping the necessary files to the cluster, and notifying the JobTracker of the new job. Afterwards, its responsibility is merely to monitor progress via reports from the JobTracker (MR1) or the ApplicationMaster (MR2).

A client need not concern itself with, nor even be aware of, TaskTrackers (or NodeManagers). These are non-permanent members of a cluster and do not carry (critical) persistent state. The scheduling of a job and its tasks is handled by the JobTracker in MR1 (or by the MR application's ApplicationMaster in MR2).

The only thing the user running a JobClient needs to ensure is that they have access to the NameNode (for creating the staging files: job jar, job.xml, etc.), the DataNodes (for actually writing those files to the DFS for the JobTracker to pick up), and the JobTracker/scheduler (for the protocol communication required to notify the cluster of a job whose resources are ready to launch, and also for monitoring progress).
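For illustration, a minimal sketch of such a remote submission, assuming MR1-era configuration keys; the host names, ports, and paths here are hypothetical, and the identity map/reduce defaults are used just to show the submission path:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class RemoteSubmit {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the remote cluster instead of the local defaults
        // (hypothetical host names and ports; MR1-era keys).
        conf.set("fs.default.name", "hdfs://namenode.example.com:8020"); // NameNode
        conf.set("mapred.job.tracker", "jobtracker.example.com:8021");   // JobTracker

        // An identity map/reduce job, kept trivial to focus on the submission path.
        Job job = new Job(conf, "remote-identity-job");
        job.setJarByClass(RemoteSubmit.class); // this jar gets shipped to the staging area
        FileInputFormat.addInputPath(job, new Path("/user/me/input"));
        FileOutputFormat.setOutputPath(job, new Path("/user/me/output"));

        // waitForCompletion() validates the configuration, writes the job jar and
        // job.xml to HDFS (NameNode + DataNodes), notifies the JobTracker, and
        // then polls it for progress reports.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

The same effect is achieved from the CLI by pointing the local core-site.xml/mapred-site.xml at the cluster, which is what "make sure your local configuration points to the right locations" in the thread below refers to.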
On Sat, Apr 21, 2012 at 5:36 AM, JAX wrote:
> RE Arindam's question on "how to submit a job remotely".
>
> Here are my follow-up questions - hope this helps to guide the discussion:
>
> 1) Normally - what is the "job client"? Do you guys typically use the namenode as the client?
>
> 2) In the case where the client != namenode, how does the client know how to start up the TaskTrackers?
>
> UCHC
>
> On Apr 20, 2012, at 11:19 AM, Amith D K wrote:
>
>> I don't know your use case; if it's for testing and SSH across the
>> machines is not disabled, then you can write a script that uses SSH to
>> run your jobs via the CLI. You can check ssh usage.
>>
>> Or else use Oozie.
>> ________________________________________
>> From: Robert Evans [evans@yahoo-inc.com]
>> Sent: Friday, April 20, 2012 11:17 PM
>> To: common-user@hadoop.apache.org
>> Subject: Re: remote job submission
>>
>> You can use Oozie to do it.
>>
>>
>> On 4/20/12 8:45 AM, "Arindam Choudhury" wrote:
>>
>> Sorry. But can you give me an example?
>>
>> On Fri, Apr 20, 2012 at 3:08 PM, Harsh J wrote:
>>
>>> Arindam,
>>>
>>> If your machine can access the cluster's NN/JT/DN ports, then you can
>>> simply run your job from the machine itself.
>>>
>>> On Fri, Apr 20, 2012 at 6:31 PM, Arindam Choudhury wrote:
>>>> "If you are allowed a remote connection to the cluster's service ports,
>>>> then you can directly submit your jobs from your local CLI. Just make
>>>> sure your local configuration points to the right locations."
>>>>
>>>> Can you elaborate in detail, please?
>>>>
>>>> On Fri, Apr 20, 2012 at 2:20 PM, Harsh J wrote:
>>>>
>>>>> If you are allowed a remote connection to the cluster's service ports,
>>>>> then you can directly submit your jobs from your local CLI. Just make
>>>>> sure your local configuration points to the right locations.
>>>>>
>>>>> Otherwise, perhaps you can choose to use Apache Oozie (Incubating)
>>>>> (http://incubator.apache.org/oozie/). It does provide a REST interface
>>>>> that launches jobs for you on the supplied clusters, but it's more
>>>>> oriented towards workflow management. Or perhaps HUE:
>>>>> https://github.com/cloudera/hue
>>>>>
>>>>> On Fri, Apr 20, 2012 at 5:37 PM, Arindam Choudhury wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Does Hadoop have any web service or other interface so I can submit
>>>>>> jobs from a remote machine?
>>>>>>
>>>>>> Thanks,
>>>>>> Arindam
>>>>>
>>>>> --
>>>>> Harsh J
>>>
>>> --
>>> Harsh J

--
Harsh J