From: Apache Wiki
To: core-commits@hadoop.apache.org
Reply-To: core-dev@hadoop.apache.org
Date: Sun, 17 May 2009 12:44:14 -0000
Message-ID: <20090517124414.10568.23027@eos.apache.org>
Subject: [Hadoop Wiki] Trivial Update of "AmazonEC2" by JoydeepSensarma

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by JoydeepSensarma:
http://wiki.apache.org/hadoop/AmazonEC2

The comment on the change is:
anchor for remote machine section

------------------------------------------------------------------------------
  }}}

   * Keep in mind that the master node is started and configured first, then all slave nodes are booted simultaneously with boot parameters pointing to the master node. Even though the `launch-cluster` command has returned, the whole cluster may not have finished booting. You should monitor the cluster via port 50030 to make sure all nodes are up.

+ [[Anchor(#FromRemoteMachine)]]
  === Running a job on a cluster from a remote machine (0.17+) ===
  In some cases it's desirable to be able to submit a job to a Hadoop cluster running in EC2 from a machine that's outside EC2 (for example, a personal workstation). Similarly, it's convenient to be able to browse/cat files in HDFS from a remote machine. One of the advantages of this setup is that it obviates the need to create custom AMIs that bundle stock Hadoop AMIs and user libraries/code. All the non-Hadoop code can be kept on the remote machine and made available to Hadoop at job submission time (in the form of jar files and other files that are copied into Hadoop's distributed cache). The only downside is the [http://aws.amazon.com/ec2/#pricing cost of copying these data sets] into EC2 and the latency involved in doing so.
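
The note above about monitoring the cluster via port 50030 after the launch command returns could be sketched roughly as follows. This is only a minimal illustration, not part of the Hadoop EC2 scripts: the function names, the retry counts, and the delay are assumptions. It only checks that something on the master accepts TCP connections on port 50030; confirming that all nodes are up still means looking at the page served there.

```python
import socket
import time


def port_is_open(host, port=50030, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def wait_for_port(host, port=50030, attempts=30, delay=10.0):
    """Poll host:port up to `attempts` times (hypothetical defaults).

    Returns True as soon as the port accepts a connection, False otherwise.
    """
    for attempt in range(attempts):
        if port_is_open(host, port):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # wait before the next try
    return False
```

From a workstation this might be called as `wait_for_port(master_host)`, where `master_host` is a placeholder for the public hostname of the master node; the workstation must be allowed to reach port 50030 on that host for the check to succeed.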