hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "TestingNov2009" by SteveLoughran
Date Fri, 20 Nov 2009 15:16:49 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "TestingNov2009" page has been changed by SteveLoughran.
The comment on this change is: problem of IaaS handing back unresponsive machines..
http://wiki.apache.org/hadoop/TestingNov2009?action=diff&rev1=2&rev2=3

--------------------------------------------------

   * Testing on EC2 runs up rapid bills if you create/destroy machines for every JUnit test method, or even for every test run. It is best to create a small pool of machines at the start of the working day and release them in the evening, and to have build file targets that destroy all of a developer's machines -and to run them at night as part of the CI build (a rough tear-down sketch follows this list).
   * Troubleshooting on IaaS platforms can be interesting, as the VMs -and their local logs- get destroyed; the test runner needs to capture the (relevant) local log data before the machines go away.
   * SSH is the primary way to communicate with the (long-haul) cluster, even from a developer's local machine.
-  * Important not to embed private data -keys, logins, in build files, test reports-
+  * Important not to embed private data -keys, logins- in build files, test reports or disk images
+  * Sometimes on EC2, and more often on smaller/unstable clusters, the allocated machines don't come up or "aren't right". You need to do early health checks on the machines and, if they are unwell, release and reallocate them (see the port-probe sketch after this list).
   * For testing local Hadoop builds on IaaS platforms, the build process needs to scp over and install the Hadoop binaries and the configuration files. This can be done by creating a new disk image that is then used to bootstrap every node, or by starting with a clean base image and copying Hadoop in on demand. The latter is much more agile and cost-effective during iterative development, but doesn't scale to very large clusters (1000s of machines) unless you delegate the copy/install task to the first few tens of allocated machines. For EC2, one tactic is to upload the binaries to S3 and have scripts on the nodes copy down and install the files (a sketch of the per-node copy/install step is below).
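
To illustrate the "destroy all of a developer's machines at night" idea, here is a minimal Java sketch (not part of the wiki page or the Hadoop build): it reads a file of previously allocated instance IDs and hands them to the EC2 command-line tools. The file name, the ec2-terminate-instances invocation and the error handling are assumptions, not project conventions.

{{{
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of a nightly tear-down target: terminate every EC2 instance whose
 * ID is listed in a per-developer file, so forgotten test clusters stop
 * costing money overnight. Assumes the EC2 API tools
 * (ec2-terminate-instances) are on the PATH with credentials configured;
 * the file name and format are illustrative only.
 */
public class TerminateTestCluster {
  public static void main(String[] args) throws IOException, InterruptedException {
    String idFile = args.length > 0 ? args[0] : "allocated-instances.txt";

    List<String> command = new ArrayList<String>();
    command.add("ec2-terminate-instances");

    // Read one instance ID per line, e.g. i-1a2b3c4d
    BufferedReader reader = new BufferedReader(new FileReader(idFile));
    try {
      String id;
      while ((id = reader.readLine()) != null) {
        id = id.trim();
        if (id.length() > 0) {
          command.add(id);
        }
      }
    } finally {
      reader.close();
    }

    if (command.size() == 1) {
      System.out.println("No instances listed in " + idFile);
      return;
    }

    // Run the EC2 tool, echo its output into the build log, fail on error
    ProcessBuilder pb = new ProcessBuilder(command);
    pb.redirectErrorStream(true);
    Process p = pb.start();
    BufferedReader out = new BufferedReader(new InputStreamReader(p.getInputStream()));
    String line;
    while ((line = out.readLine()) != null) {
      System.out.println(line);
    }
    if (p.waitFor() != 0) {
      throw new IOException("ec2-terminate-instances failed");
    }
  }
}
}}}

Wired into a CI job that runs after midnight, a non-zero exit from the tool fails the build loudly, which is usually what you want when machines refuse to die.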
  
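The early health check can be as simple as probing each allocated machine's SSH port with a short timeout before any Hadoop installation starts. The sketch below is an illustration in plain Java, not project code; the port, retry count and timeouts are placeholder values. Anything that never answers goes onto a release-and-reallocate list.

{{{
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Sketch of an early health check on freshly allocated VMs: probe the SSH
 * port with a short timeout and report the hosts that never respond, so the
 * caller can release and reallocate them.
 */
public class ClusterHealthCheck {

  /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
  static boolean portOpen(String host, int port, int timeoutMs) {
    Socket socket = new Socket();
    try {
      socket.connect(new InetSocketAddress(host, port), timeoutMs);
      return true;
    } catch (IOException e) {
      return false;
    } finally {
      try { socket.close(); } catch (IOException ignored) { }
    }
  }

  /** Probes every host a few times; hosts that never answer are returned as unhealthy. */
  static List<String> findUnhealthy(List<String> hosts, int port, int attempts, int timeoutMs)
      throws InterruptedException {
    List<String> unhealthy = new ArrayList<String>();
    for (String host : hosts) {
      boolean up = false;
      for (int i = 0; i < attempts && !up; i++) {
        up = portOpen(host, port, timeoutMs);
        if (!up) {
          Thread.sleep(timeoutMs);   // give a slow-booting VM a little more time
        }
      }
      if (!up) {
        unhealthy.add(host);
      }
    }
    return unhealthy;
  }

  public static void main(String[] args) throws InterruptedException {
    List<String> hosts = Arrays.asList(args);
    // Probe SSH (port 22): 5 attempts, 10 second connect timeout per attempt
    List<String> bad = findUnhealthy(hosts, 22, 5, 10000);
    if (bad.isEmpty()) {
      System.out.println("All " + hosts.size() + " machines answered on port 22");
    } else {
      System.out.println("Release and reallocate: " + bad);
    }
  }
}
}}}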
  
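For the "clean base image, copy Hadoop in on demand" route, the per-node step amounts to scp plus a remote untar. The following Java sketch shells out to scp and ssh for a handful of nodes; the tarball path, remote user and directories are illustrative guesses, and at real scale you would parallelise this or push the binaries through S3 as described above.

{{{
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

/**
 * Sketch of the "clean base image + copy in Hadoop on demand" approach:
 * scp a locally built Hadoop tarball and a config directory to each node,
 * then unpack it over ssh. Paths, user name and target directory are
 * assumptions; a real build would drive this from the build file.
 */
public class PushHadoopBuild {

  /** Runs a command, echoes its output, and fails on a non-zero exit code. */
  static void run(String... command) throws IOException, InterruptedException {
    ProcessBuilder pb = new ProcessBuilder(command);
    pb.redirectErrorStream(true);
    Process p = pb.start();
    BufferedReader out = new BufferedReader(new InputStreamReader(p.getInputStream()));
    String line;
    while ((line = out.readLine()) != null) {
      System.out.println(line);
    }
    if (p.waitFor() != 0) {
      throw new IOException("'" + command[0] + "' exited with a non-zero status");
    }
  }

  public static void main(String[] args) throws IOException, InterruptedException {
    String tarball = "build/hadoop-dev.tar.gz";   // locally built artifact (illustrative path)
    String confDir = "conf-cluster";              // cluster-specific configuration (illustrative)
    for (String host : args) {                    // hostnames of the allocated VMs
      String target = "hadoop@" + host;
      run("scp", tarball, target + ":/tmp/hadoop.tar.gz");
      run("scp", "-r", confDir, target + ":/tmp/hadoop-conf");
      // Unpack on the remote node; assumes passwordless ssh keys are already installed
      run("ssh", target, "tar -xzf /tmp/hadoop.tar.gz -C /home/hadoop");
    }
  }
}
}}}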
