Subject: Node manager health checker opts
From: Tucker
To: user@hadoop.apache.org
Date: Mon, 25 Mar 2013 11:14:48 -0700

Does anyone have a working example of a node manager health checker script using "yarn.nodemanager.health-checker.script.opts"? I wrote a health checker that works fine, but one of the items being checked is a little too sensitive. I wrote it so that modules can be loaded and unloaded by passing various flags. Unfortunately, adding these flags to my config doesn't seem to have had any effect, and we've had to disable the health check entirely.

For reference:

$ health_checker -h
Usage: health_checker [options]
        --default-disabled           Default all checks disabled.
    -e, --enable-checks CHECKS       Command separated list of checks to enable.
    -d, --disable-checks CHECKS      Command separated list of checks to disable.
    -l, --list                       List available checks.

Settings used:

<property>
<name>yarn.nodemanager.health-checker.script.path</name>
<value>/usr/bin/health_checker</value>
</property>
...
<property>
<name>yarn.nodemanager.health-checker.script.opts</name>
<value>-d Network</value>
</property>

If the flag were actually being passed, I would expect the script to report healthy. This is what I see on the command line:

# health_checker
ERROR(s): ["Errors found on interface eth2."]
# health_checker -d Network
Healthy
# echo $?
0

Unfortunately, even with opts set, I continue to get the interface errors warning after cluster start and after the run interval has passed. I assume I'm missing something, but I can't seem to find any good docs on the matter.
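
One way to check whether "-d Network" actually reaches the script might be a small logging wrapper set as yarn.nodemanager.health-checker.script.path in place of the real checker. The following is only a sketch under assumptions: the wrapper name health_checker_wrapper, the function log_and_run, and the log path /tmp/health_checker_args.log are all hypothetical and not part of Hadoop or of my checker.

```shell
#!/bin/sh
# Hypothetical debugging wrapper (name, function, and log path are assumptions).
# It records the arguments it was started with, then runs the real checker
# with exactly those arguments, passing through its output and exit status.

# log_and_run LOG_FILE CHECKER [ARGS...]
log_and_run() {
    log_file=$1
    checker=$2
    shift 2
    # Append the argument count and the space-joined arguments to the log.
    printf 'argc=%d argv=%s\n' "$#" "$*" >> "$log_file"
    # Run the real checker; its exit status becomes the function's status.
    "$checker" "$@"
}

# Only act when installed under the assumed wrapper name, so that the file
# can also be sourced without side effects.
case "${0##*/}" in
    health_checker_wrapper)
        log_and_run /tmp/health_checker_args.log /usr/bin/health_checker "$@"
        ;;
esac
```

If the log shows argc=0 after the node manager has run the script, the opts value is not being passed at all. If it shows a single argument "-d Network" rather than two, the value is being passed unsplit.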

--

--tucker