Date: Sat, 14 Apr 2012 16:06:17 -0400
Subject: Capacity
From: Bruce Barkstrom
To: dev@oodt.apache.org

Defining "capacity" is not an easy thing to do. The books that David Patterson and colleagues have written on computer design show that none of the usual metrics, like MFLOPS, MIPS, and so on, have any sort of well-founded theoretical basis. I tried using Dongarra's MFLOPS database and found that the most sensible predictor in a regression was just the clock speed of the processor - but that was before multi-core machines came along and folks stopped increasing the clock speed to hold down power consumption.

The most sensible production metric that I can think of is the wall clock time from job inception to completion - probably including the time to stage data to someplace useful and then the time to move the results from the computation back to a sensible storage spot. That metric at least fits into large scale scheduling approaches (and looks like a Gantt chart activity).

DOE has apparently been trying to simulate actual computational loads for some supercomputer simulations - but they don't try to apply their simulations to the entire runs they're going to make - even they don't have the computer power for that.

I've also got a book on manufacturing systems engineering that says capacity is a stochastic property of systems. I'll even note that schedules can have a structurally stochastic behavior (as in "I didn't know the machine was going to catch on fire - and it burned the whole factory - so now what do we do?").

Thus, the key guidance is - keep the metric simple. If you want solid numerical values, you'll have to run an experiment on how long it will take to run a job.

I should perhaps note that there are similar pleasantries on trying to do network capacity estimates.
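To make the wall-clock metric above concrete, here is a rough sketch in Python of the kind of experiment I have in mind: time each stage of a job (staging data in, computing, moving results out) and the job as a whole, then repeat a few times. The names `time_job` and `run_experiment` and the three stage names are made up for illustration, and the `sleep` calls only stand in for real work; none of this is OODT code.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_job(stages: Dict[str, Callable[[], None]]) -> Dict[str, float]:
    """Run the stages in order; return wall-clock seconds per stage plus "total"."""
    timings: Dict[str, float] = {}
    job_start = time.perf_counter()
    for name, stage in stages.items():
        start = time.perf_counter()
        stage()
        timings[name] = time.perf_counter() - start
    # Total runs from job inception to completion, so it also covers
    # any gaps between stages.
    timings["total"] = time.perf_counter() - job_start
    return timings


def run_experiment(stages: Dict[str, Callable[[], None]], repeats: int = 5) -> List[float]:
    """Repeat the whole job and collect the total wall-clock time of each run."""
    return [time_job(stages)["total"] for _ in range(repeats)]


if __name__ == "__main__":
    # Hypothetical stages: sleep is a placeholder for staging and computation.
    stages = {
        "stage_in": lambda: time.sleep(0.01),
        "compute": lambda: time.sleep(0.02),
        "stage_out": lambda: time.sleep(0.01),
    }
    totals = run_experiment(stages, repeats=5)
    print(f"median {statistics.median(totals):.3f} s, max {max(totals):.3f} s")
```

Reporting the median and the maximum rather than a single number is a nod to the stochastic behavior mentioned above; one run tells you very little.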
One of the typical approaches to dealing with capacities is to build a queuing theory model. If I recall correctly, the time to deliver a file over the Internet has a very long-tailed distribution. An interested party could probably build a probability distribution and then use occasional tests to update that distribution. However, simple math with a few parameters it's not.

Bruce R. Barkstrom
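
As a rough illustration of the "build a distribution and update it with occasional tests" idea, here is a minimal Python sketch. It is an empirical stand-in, not a queuing theory model: it keeps a sliding window of measured transfer times and reports quantiles, since with a long tail the mean is not very informative. The class name `TransferTimeModel`, the window size, and the sample values are all assumptions made up for the example.

```python
import math
from collections import deque
from typing import Deque


class TransferTimeModel:
    """Empirical distribution of transfer times over a sliding window of tests."""

    def __init__(self, window: int = 200) -> None:
        # Old measurements fall off the left end once the window is full.
        self._samples: Deque[float] = deque(maxlen=window)

    def update(self, seconds: float) -> None:
        """Record the result of one occasional transfer test."""
        if seconds < 0:
            raise ValueError("transfer time must be non-negative")
        self._samples.append(seconds)

    def quantile(self, q: float) -> float:
        """Nearest-rank quantile of the stored samples, for q in (0, 1]."""
        if not self._samples:
            raise ValueError("no observations yet")
        if not 0.0 < q <= 1.0:
            raise ValueError("q must be in (0, 1]")
        ordered = sorted(self._samples)
        rank = math.ceil(q * len(ordered))
        return ordered[rank - 1]


if __name__ == "__main__":
    model = TransferTimeModel(window=50)
    # Made-up measurements in seconds, with one slow outlier in the tail.
    for seconds in [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0]:
        model.update(seconds)
    print("median:", model.quantile(0.5), "p95:", model.quantile(0.95))
```

The sliding window is one simple way to let newer tests displace older ones; a fitted heavy-tailed distribution would be another route, at the cost of more parameters.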