hadoop-general mailing list archives

From Bruno Mahé <bm...@apache.org>
Subject New vm available for Hadoop 0.20.205/0.22/0.23 branches
Date Thu, 15 Dec 2011 09:42:27 GMT

I am glad to announce the availability of a new deployment method for
Apache Hadoop and related projects in Apache Bigtop (incubating).
Anyone can now download an image for their favourite virtualization
technology with Apache Hadoop pre-installed and configured.

This may be useful in a lot of cases such as:
* You want to get familiar with Apache Hadoop or related projects
without worries regarding the setup
* You want to run some tests in a reproducible environment without being
afraid of breaking your system
* You want to develop and test your project against some specific
* You want a quick look at what is coming up in the Apache Hadoop 0.23 branch
* You want to maintain some specific images at your cloud provider (ex:
AWS, ElasticHosts)
* You use an operating system which is not compatible with Apache Hadoop
but would like to write/test some jobs against Apache Hadoop or related
projects
* You want to quickly get familiar with Apache Hadoop or related projects
in a real distributed mode across multiple VMs

Currently Apache Bigtop only provides a base appliance with Apache
Hadoop running in pseudo-distributed mode on CentOS 6.
Several jobs have been created on our Jenkins instance to generate
images based on the released version of Apache Hadoop 0.20.205 as well
as the in-development branches of Apache Hadoop 0.22 (needs to be updated
to the released bits) and Apache Hadoop 0.23. Each of these jobs creates
images for the following virtualization technologies:
* KVM (libvirtd)
* VirtualBox
* VMware

Under the hood we use BoxGrinder (http://boxgrinder.org/), a fantastic
tool for generating all these VMs.
Currently it can create VMs for RHEL/CentOS/Fedora/ScientificLinux, but
could handle more OSes through plugins.
The lists of supported virtualization technologies (KVM, VirtualBox,
VMware, EC2) and delivery methods (local, sftp, s3, ebs, elastichosts,
and soon local/remote libvirtd) are also pretty large.
The BoxGrinder appliance definition format is also pretty simple to
understand and the current appliance should be easy to modify and
extend. There is also a convenient method to make an appliance inherit
from one or several other appliances.
This explains why the current appliance in Apache Bigtop is quite small,
so one could easily create an appliance with all the packages provided
by Apache Bigtop by just providing the additional packages.
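To illustrate the inheritance idea, here is a minimal Python sketch that
renders a small appliance definition in the YAML-style layout BoxGrinder
uses. It is only an illustration under stated assumptions: the appliance
names "bigtop-base" and "bigtop-extended" and the package list are
hypothetical, and the helper render_appliance is not part of BoxGrinder or
Apache Bigtop. Check the BoxGrinder documentation and the appliance file
shipped in Apache Bigtop for the exact keys and names.

```python
def render_appliance(name, summary, parents, packages):
    """Return the text of a BoxGrinder-style appliance definition.

    Only a few keys are emitted here (name, summary, appliances,
    packages); a real definition may carry more (OS, hardware, ...).
    """
    lines = [f"name: {name}", f"summary: {summary}"]
    if parents:
        # Parent appliances this definition inherits from.
        lines.append("appliances:")
        lines += [f"  - {parent}" for parent in parents]
    if packages:
        # Only the packages added on top of the parents are listed.
        lines.append("packages:")
        lines += [f"  - {package}" for package in packages]
    return "\n".join(lines) + "\n"


definition = render_appliance(
    name="bigtop-extended",          # hypothetical appliance name
    summary="Base appliance plus extra packages",
    parents=["bigtop-base"],         # hypothetical name of the base appliance
    packages=["pig", "hive"],        # illustrative package names
)
print(definition)
```

The child definition stays short because it only names its parent and the
additional packages.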

If you wish to download the VMs built by the Apache Bigtop (incubating)
Jenkins instance, here are the links:
* Apache Bigtop 0.2.0 (incubating), includes Apache Hadoop 0.20.205:
  - KVM: http://bit.ly/tlPZCz
  - VMware: http://bit.ly/s1V42p
  - VirtualBox: http://bit.ly/s5vuj8

* Branch hadoop-0.22 in Apache Bigtop (incubating):
  - KVM: http://bit.ly/sMSupy
  - VMware: http://bit.ly/tSfN6E
  - VirtualBox: http://bit.ly/sc0wvL

* Branch hadoop-0.23 in Apache Bigtop (incubating):
  - KVM: http://bit.ly/sBdEVX
  - VMware: http://bit.ly/vj22mo
  - VirtualBox: http://bit.ly/ttwm5k

* The jenkins job creating these images is located there:

Once the chosen artefact is downloaded and expanded, you just need to
tell your virtualization tool to import an existing disk.
Also be careful about the RAM allocated to your VM. Apache Hadoop
0.23 has become quite memory hungry and will not be able to run the pi
example on a VM with 1024MB of RAM.
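For KVM users, the import step could look roughly like the Python sketch
below, which builds a virt-install command line around an existing disk
image and flags a RAM setting at or below the 1024MB mentioned above. The
VM name, the disk path and the helper functions are hypothetical, and the
sketch only prints the command instead of running it. Recent virt-install
releases may also ask for an OS hint, which is left out here.

```python
import shlex


def ram_warning(ram_mb):
    """Return a warning string if ram_mb is at or below 1024, else None.

    The 1024MB figure is the only one given in this email; staying silent
    above it does not guarantee that the amount is sufficient.
    """
    if ram_mb <= 1024:
        return (f"{ram_mb}MB may be too little: Hadoop 0.23 could not run "
                "the pi example with 1024MB")
    return None


def virt_install_args(name, disk_path, ram_mb):
    """Build a virt-install argument list that imports an existing disk."""
    return [
        "virt-install",
        "--name", name,
        "--ram", str(ram_mb),           # memory in MB
        "--disk", f"path={disk_path}",  # the downloaded, expanded image
        "--import",                     # skip OS installation, boot the disk
    ]


# Hypothetical VM name and image path, for illustration only.
args = virt_install_args("hadoop-vm", "/var/lib/libvirt/images/hadoop.raw", 2048)
print(shlex.join(args))

warning = ram_warning(1024)
if warning:
    print("warning:", warning)
```

VirtualBox and VMware users would do the equivalent through their own
tools by creating a VM and attaching the downloaded disk.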

Please don't hesitate to share your feedback, ideas or issues on the
Apache Bigtop (incubating) mailing list or the ticket tracker.

Bruno Mahé

PS: I CCed the Apache Hadoop general list since this email may interest a
few folks focused on that mailing list
