hadoop-common-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: virtualization with hadoop
Date Thu, 26 Mar 2009 15:23:37 GMT
I use linux-vserver http://linux-vserver.org/

The Linux-VServer technology is a soft partitioning concept based on
Security Contexts which permits the creation of many independent
Virtual Private Servers (VPS) that run simultaneously on a single
physical server at full speed, efficiently sharing hardware resources.

Usually when people talk about virtual machines, I hear about VMware,
Xen, and QEMU. For MY purposes Linux-VServer is far superior to all of
them, and it's very helpful for the Hadoop work I do. (I only want
Linux guests.)

No emulation overhead - I installed VMware Server on my laptop and was
able to get 3 Linux instances running before the system became
unusable, even though the instances were not doing anything.

With VServer my system is not wasting cycles emulating devices. The
guests securely share a kernel and memory, so you can effectively run
many more of them at once. This leaves the processor for user
processes (Hadoop), not emulation overhead.

A minimal installation is 50 MB. I do not need a multi GB Linux
install just to test a version of hadoop. This allows me to recklessly
make VMs for whatever I want and not have to worry about GB chunks of
my hard drive going with each VM.

I can tar up a VM and use it as a template to install another VM. Thus
I can deploy a new system in under 30 seconds. The HTTP RPM install
takes about 2 minutes.
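
As a rough illustration of the tar-as-template idea, here is a minimal
Python sketch. It assumes a guest's root filesystem is an ordinary
directory on the host; the function names and all paths are
hypothetical, and it does not create the per-guest VServer
configuration, which a real clone would also need. On a real root
filesystem you would run it as root so ownership and special files
survive.

```python
import tarfile
from pathlib import Path


def make_template(guest_root: Path, template_path: Path) -> None:
    """Pack a guest's root filesystem into a gzip-compressed tarball."""
    with tarfile.open(template_path, "w:gz") as tar:
        # arcname="." stores paths relative to the guest root, so the
        # archive can be unpacked under any new guest directory.
        tar.add(guest_root, arcname=".")


def clone_from_template(template_path: Path, new_guest_root: Path) -> None:
    """Unpack a template tarball into a new, not yet existing guest directory."""
    # exist_ok=False: refuse to unpack over an existing guest.
    new_guest_root.mkdir(parents=True, exist_ok=False)
    with tarfile.open(template_path, "r:gz") as tar:
        tar.extractall(new_guest_root)
```

For example, make_template(Path("/vservers/base"), Path("/tmp/base.tar.gz"))
followed by clone_from_template(Path("/tmp/base.tar.gz"), Path("/vservers/test1")),
where both guest paths are placeholders for wherever your guests live.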

The guest is chroot'ed, so its filesystem is just a directory on the
host. I can easily copy files into the guest using ordinary copy
commands. Think ant deploy -DTARGETDIR=/path/to/guest.
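
The same kind of deploy can be sketched in Python. This is only an
illustration of copying a build tree into a guest's directory from the
host; deploy_to_guest and every path used with it are hypothetical
names, not part of VServer or Hadoop.

```python
import shutil
from pathlib import Path


def deploy_to_guest(build_dir: Path, guest_root: Path, target: str) -> Path:
    """Copy a build tree to `target`, a path as seen from inside the guest."""
    # Map the guest-side absolute path onto the host-side directory tree.
    dest = guest_root / target.lstrip("/")
    dest.parent.mkdir(parents=True, exist_ok=True)
    # dirs_exist_ok=True (Python 3.8+) lets a repeated deploy overwrite
    # files from an earlier one instead of failing.
    shutil.copytree(build_dir, dest, dirs_exist_ok=True)
    return dest
```

Something like deploy_to_guest(Path("build/hadoop"), Path("/path/to/guest"), "/opt/hadoop")
would then place the build under /opt/hadoop as the guest sees it.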

>>But it is horrible slow if you not have enough ram and multiple
>>disks since all I/o-Operations go to the same disk.

VServer will not solve this problem, but at least you won't be losing
I/O to 'emulation'.

If you are working with hadoop and you need to be able to have
multiple versions running, with different configurations, take a look
at VServer.
