hadoop-common-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Cluster Machines
Date Fri, 06 Nov 2009 17:22:39 GMT
On Fri, Nov 6, 2009 at 11:53 AM, Allen Wittenauer
<awittenauer@linkedin.com> wrote:
>
>
>
> On 11/6/09 4:04 AM, "Steve Loughran" <stevel@apache.org> wrote:
>> Anyone been monitoring how much temp spaces the MR consumes, and got
>> some stats to share?
>
> It is so tied to how well tuned and the actual workload, that I'm not sure
> stats are all that useful.  I went with 100gb mainly because I knew that one
> of the biggest jobs at Yahoo! would consume that much.  In reality, LI could
> probably get by with 10x less, esp since we have 8 drives instead of Y!'s 4.
>
>
If you want a virtualized setup, I suggest Linux-VServer:
http://www.linux-vserver.org/Welcome_to_Linux-VServer.org
Why?
It is very close to a Solaris Zone: unlike VM products such as VMware,
Linux-VServer is a "jail" system with resource controls for CPU
(timeslice), memory, etc. There is no emulation going on, just a few
more operations in the patched kernel, so it is fast.

I use it extensively with Hadoop and everything else I do. In your
case, since you want dual-purpose machines, it can be very effective:
you can run several instances on the same machine and set resource
controls to make sure they do not trample each other.

Edward
