hadoop-common-user mailing list archives

From "Edward Capriolo" <edlinuxg...@gmail.com>
Subject Re: Hadoop Distributed Virtualisation
Date Fri, 06 Jun 2008 16:41:18 GMT
I once asked a wise man in charge of a rather large multi-datacenter
service, "Have you ever considered virtualization?" He replied, "All
the CPUs here are pegged at 100%."

There may be applications for this type of processing. I have thought
about systems like this from time to time, and the thinking goes in
circles: Hadoop is designed to spread storage and processing across
separate hardware, while virtualization splits one system into
sub-systems.

Virtualization is great for proof of concept.
For example, I have deployed this: I installed VMware with two Linux
systems on my Windows host and followed a Hadoop multi-system tutorial
on the two VMware nodes. I was able to get the word count
application working. I also confirmed that blocks were indeed being
stored on both virtual systems and that processing was being shared.

The processing, however, was slow. Of course, this is the fault of
VMware. VMware has a very high emulation overhead. Xen has less
overhead. Linux-VServer and OpenVZ use operating-system-level
virtualization (they have very little, almost no, overhead).
Regardless of how much overhead there is, overhead is overhead.
Personally, I find that VMware falls short of its promises.
