oodt-dev mailing list archives

From "Wong, Cynthia L (388J)" <cynthia.l.w...@jpl.nasa.gov>
Subject Re: Capacity vs Load in Resource Manager
Date Tue, 17 Apr 2012 14:27:16 GMT

Thank you for your replies. I'll check these out and may bring in more


Cynthia Wong--
Cynthia L. Wong
Data Management Systems and Technologies
Jet Propulsion Laboratory
4800 Oak Grove Drive, M/S  171-264, Pasadena, CA  91109-8099
Phone:  818/393-2572, Email: Cynthia.L.Wong@jpl.nasa.gov

On 4/16/12 8:27 PM, "Mattmann, Chris A (388J)"
<chris.a.mattmann@jpl.nasa.gov> wrote:

>Hi Gabe,
>On Apr 16, 2012, at 11:44 AM, Resneck, Gabriel M (388J) wrote:
>> To use Chris's words, when using the "fresh-out-of-the-box" version of
>>the RM, both of the concepts of Capacity and Load are entirely
>I'd clarify that while the default values set for these concepts are
>arbitrary, the concepts themselves are not. Capacity is used
>by the AssignmentMonitor and is a core property of the ResourceNode
>class. Load is leveraged by the AssignmentMonitor
>to determine the current busyness of one of the ResourceNodes.
>> They have no relation to any kind of resources available on your node
>Well, again, the default out of the box values for these concepts don't,
>but the concepts themselves do.
>> Therefore, if you give each job a load of 1 (regardless of the node
>>resources required to run the job) and if you give a node a capacity of
>>10, the RM will try to always have 10 jobs running on that node.
>>  It does nothing to track resource usage on the node, so use of such a
>>paradigm as the one that I just described could be wildly inefficient.
>Let's clarify that again. Saying it *does nothing* doesn't sound
>right to me. It *does* do something. It tracks how
>much load is currently on a node, compared to its current capacity, and
>provides that information as-is to the Scheduler,
>which in turn uses the information in a node "besting"
>algorithm to decide which node to
>batch a job out to. So, it does *do something*. It's just that it's not
>real-time; it's more of a virtual profiling. And, let's be specific.
>The XMLAssignmentMonitor decides how this information will be used and
>provided and tracked. This is just one
>potential implementation of the AssignmentMonitor RM extension point.
>We could (and should) develop a Ganglia resource monitor that could
>leverage Ganglia information to plug in. And
>we could develop a TorqueAssignmentMonitor that uses qmon or something
>like it to parse the information out of
>Torque's queue. We could also connect in to Sun Grid Engine (SGE) or
>another DRM technology to get this
>information too.
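
The virtual load tracking described above can be sketched roughly as follows. This is a minimal illustration in Python, not OODT code: the Node class, pick_node, and assign are hypothetical names, and choosing the node with the most spare capacity is an assumed selection rule rather than the Scheduler's actual algorithm.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a resource node; not the OODT ResourceNode class."""
    node_id: str
    capacity: int
    load: int = 0  # virtual load: sum of declared job loads, not measured usage

    def can_accept(self, job_load: int) -> bool:
        # A job fits only if its declared load stays within the node's capacity.
        return self.load + job_load <= self.capacity


def pick_node(nodes, job_load):
    """Return the eligible node with the most spare capacity, or None (assumed rule)."""
    eligible = [n for n in nodes if n.can_accept(job_load)]
    if not eligible:
        return None
    return max(eligible, key=lambda n: n.capacity - n.load)


def assign(nodes, job_load):
    """Pick a node and record the job's declared load against it."""
    node = pick_node(nodes, job_load)
    if node is not None:
        node.load += job_load
    return node


nodes = [Node("node-a", capacity=10), Node("node-b", capacity=4)]
placements = [assign(nodes, job_load=1) for _ in range(15)]
placed = [n.node_id for n in placements if n is not None]
# Jobs placed on node-a, jobs placed on node-b, jobs that found no node.
print(placed.count("node-a"), placed.count("node-b"), placements.count(None))
```

Nothing here measures CPU, memory, or I/O on the node. The only inputs are the numbers declared for each job and each node, which is why the tracking is virtual rather than real-time.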
>> Because these numbers are arbitrary, I recommend carefully
>>investigating the availability of resources on your nodes and setting
>>load and capacity levels using that information.  For example, if you
>>find that your jobs tend to be I/O bound when you have more than 3
>>running simultaneously on the same node, then you could set your job
>>load to 1 and the node capacity to 3.  If you wanted more granularity,
>>you could easily set the load to 33 and the capacity to 100.  Since
>>these numbers are entirely arbitrary, you have the freedom to make such
>>changes.  Obviously, not all jobs will be the same, so you may want to
>>assign different loads to different jobs and assign different capacities
>>to nodes based upon the resources that each makes available.
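
The arithmetic behind the load and capacity example above can be checked with a short Python sketch. The helper names max_concurrent and fits are hypothetical, and the sketch assumes a job is admitted only while the summed load stays at or below the node's capacity.

```python
def max_concurrent(capacity: int, job_load: int) -> int:
    """How many identical jobs fit on one node under the assumed admission rule."""
    return capacity // job_load


def fits(capacity: int, job_loads) -> bool:
    """Whether a mix of jobs with different declared loads fits on one node."""
    return sum(job_loads) <= capacity


# Load 1 with capacity 3, and load 33 with capacity 100, both allow 3 jobs.
print(max_concurrent(3, 1))
print(max_concurrent(100, 33))

# The finer scale lets jobs of different weight share a node.
print(fits(100, [50, 25, 25]))  # total 100
print(fits(100, [50, 33, 25]))  # total 108
```

The coarse and fine scales admit the same number of identical jobs. The finer scale only matters once jobs are given different loads.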
>Exactly. And to add to that, you can group different jobs into different
>queues, and then queues to nodes, to control flow of jobs
>onto those nodes, based on a "queue type".
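
The grouping of jobs into queues and queues onto nodes can be pictured as two lookups. The sketch below is in Python with made-up queue, job type, and node names. It does not reflect the Resource Manager's actual configuration format.

```python
# Hypothetical queue and node names, for illustration only.
QUEUE_TO_NODES = {
    "io-bound": ["node-a"],
    "cpu-bound": ["node-b", "node-c"],
}

# Hypothetical job types, each tied to one queue.
JOB_TYPE_TO_QUEUE = {
    "ingest": "io-bound",
    "reprocess": "cpu-bound",
}


def candidate_nodes(job_type):
    """Return the nodes a job of this type may be sent to, via its queue."""
    queue = JOB_TYPE_TO_QUEUE[job_type]
    return QUEUE_TO_NODES[queue]


print(candidate_nodes("ingest"))
print(candidate_nodes("reprocess"))
```

Load and capacity checks would then apply only to the nodes returned for the job's queue.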
>Chris Mattmann, Ph.D.
>Senior Computer Scientist
>NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>Office: 171-266B, Mailstop: 171-246
>Email: chris.a.mattmann@nasa.gov
>WWW:   http://sunset.usc.edu/~mattmann/
>Adjunct Assistant Professor, Computer Science Department
>University of Southern California, Los Angeles, CA 90089 USA
