hadoop-common-dev mailing list archives

From "Tim Patton" <tpat...@dealcatcher.com>
Subject Load Balancing?
Date Fri, 24 Feb 2006 20:10:32 GMT
Sorry to post this on the dev list, but the user list doesn't seem to get
any traffic. I've been going through the code trying to figure out how
Hadoop balances load among machines. Specifically, if I have two types
of tasks, one high-CPU and one low-CPU, how can I make sure machines aren't
getting bogged down by being assigned too many high-CPU tasks, or aren't
sitting idle when they could be running more low-CPU tasks? I suppose the
same question could be asked about high and low RAM usage as well. From
looking at the Nutch source code, it appears to fetch and process first,
then index, when these steps could be done in parallel. Is load balancing
the reason for that?


