Subject: Using CapacityScheduler to divide resources between jobs (not users)
From: Amit Sela
To: user@hadoop.apache.org
Date: Sat, 6 Jul 2013 17:12:01 +0300

Hi all,

I'm running Hadoop 1.0.4 on a modest cluster (~20 machines).
The jobs running on the cluster can be divided (resource-wise) as follows:

1. Very short jobs: less than 1 minute.
2. Normal jobs: 2-3 minutes up to an hour or two.
3. Very long jobs: days of processing (not active yet, and the reason for my inquiry here).

I was thinking of using the CapacityScheduler to divide the cluster resources so that the long jobs can run without disturbing the other jobs.
I read that the queue for such jobs should also have an upper bound: a queue may use the entire cluster's resources while they are free, but because the long jobs take so long to finish, the queue won't release those resources to other queues as it should. Is that so?
Any advice about using the CapacityScheduler in this use case?

Thanks, and sorry for re-sending this message.

Amit.
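To make the upper-bound question concrete, here is a minimal sketch of the arithmetic involved. It assumes the Hadoop 1.x CapacityScheduler properties mapred.capacity-scheduler.queue.<queue-name>.capacity (guaranteed share, in percent) and mapred.capacity-scheduler.queue.<queue-name>.maximum-capacity (ceiling, in percent, with -1 meaning no ceiling). The queue names, the percentages, and the slot count are all hypothetical, and the rounding is a simplification rather than the scheduler's exact behavior.

```python
# Illustrative only: queue names, percentages, and the slot count are hypothetical.
TOTAL_MAP_SLOTS = 80  # assumption: ~20 machines x 4 map slots each

# queue name -> (capacity %, maximum-capacity %); -1 means no upper bound
QUEUES = {
    "short": (20, -1),
    "normal": (50, -1),
    "long": (30, 40),  # long jobs are capped so they cannot hold the whole cluster
}


def slot_bounds(total_slots, capacity_pct, max_capacity_pct):
    """Return (guaranteed, ceiling) slot counts for one queue."""
    guaranteed = total_slots * capacity_pct // 100
    if max_capacity_pct < 0:
        ceiling = total_slots  # unbounded queue may grow into all free slots
    else:
        ceiling = total_slots * max_capacity_pct // 100
    return guaranteed, ceiling


def summarize(total_slots, queues):
    """Map each queue name to its (guaranteed, ceiling) slot counts."""
    # Sanity check for this sketch: guaranteed shares should add up to 100%.
    assert sum(cap for cap, _ in queues.values()) == 100
    return {
        name: slot_bounds(total_slots, cap, max_cap)
        for name, (cap, max_cap) in queues.items()
    }


if __name__ == "__main__":
    for name, (guaranteed, ceiling) in summarize(TOTAL_MAP_SLOTS, QUEUES).items():
        print(f"{name}: guaranteed={guaranteed} slots, ceiling={ceiling} slots")
```

With these made-up numbers, the "long" queue can never hold more than 40% of the slots, so even when it has grown into idle capacity, at least 60% stays available to the short and normal queues without waiting for multi-day tasks to finish.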