From: yves callaert
To: user@hadoop.apache.org
Subject: RE: Monitoring dashboard for Hadoop?
Date: Thu, 4 Jun 2015 06:20:00 +0000

Hi,
Depending on the version you are using, there are several ways to monitor jobs.
You can use Hue (originally a Cloudera project), which has a job monitoring system, but you could also use the YARN ResourceManager UI to follow jobs.
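
If you would rather script this than watch the web UI, the ResourceManager serves the same job information over its REST API under /ws/v1/cluster/apps. The following is only a minimal sketch: the host name "master" and port 8088 (the usual default for the ResourceManager web address) are assumptions about your cluster, and the helper names are made up for illustration.

```python
import json
from urllib.request import urlopen

# Assumption: ResourceManager web address; 8088 is the usual default port.
RM_URL = "http://master:8088"


def summarize_apps(payload):
    """Turn a /ws/v1/cluster/apps response into (id, name, state, progress) rows."""
    # "apps" may be null/None when no application matches the query.
    apps = (payload.get("apps") or {}).get("app", [])
    return [(a["id"], a["name"], a["state"], a.get("progress", 0.0)) for a in apps]


def fetch_apps(rm_url=RM_URL, states="RUNNING"):
    """Ask the ResourceManager REST API for applications in the given states."""
    url = f"{rm_url}/ws/v1/cluster/apps?states={states}"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def print_running_jobs():
    """Print one line per running application (needs a reachable ResourceManager)."""
    for app_id, name, state, progress in summarize_apps(fetch_apps()):
        print(f"{app_id}  {name}  {state}  {progress:.0f}%")
```

Calling print_running_jobs() from a cron job or a small loop gives a crude text dashboard of job state and progress.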

Monitoring of nodes can be done through Ambari (https://ambari.apache.org/) or Cloudera Manager (only available for Cloudera distributions).

As far as I know, the replication process for HDFS cannot be changed to favour particular nodes.
An even distribution is needed in order to have an evenly spread load.
If a block replica gets corrupted, this will be visible in the logs, but the NameNode will correct the problem automatically by creating a new copy of the block from a healthy replica.
Normally you will have a replication factor of 3, but you can raise this if you want data to be spread across more nodes.
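
To check or change the replication factor of a single file from a script, one option is the WebHDFS REST API (operations GETFILESTATUS and SETREPLICATION). This is a sketch under assumptions: WebHDFS is enabled, the NameNode HTTP address is "master:50070" (the Hadoop 2.x default port), and the helper names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: NameNode HTTP address; 50070 is the Hadoop 2.x default port.
NAMENODE_URL = "http://master:50070"


def webhdfs_url(path, op, namenode_url=NAMENODE_URL, **params):
    """Build a WebHDFS URL for an absolute HDFS path and an operation."""
    query = "&".join([f"op={op}"] + [f"{k}={v}" for k, v in params.items()])
    return f"{namenode_url}/webhdfs/v1{quote(path)}?{query}"


def replication_of(file_status_payload):
    """Extract the replication factor from a GETFILESTATUS response."""
    return file_status_payload["FileStatus"]["replication"]


def get_replication(path):
    """Read the current replication factor of a file (needs a reachable NameNode)."""
    with urlopen(webhdfs_url(path, "GETFILESTATUS"), timeout=10) as resp:
        return replication_of(json.load(resp))


def set_replication(path, factor):
    """Request a new replication factor; returns True if the NameNode accepted it."""
    req = Request(webhdfs_url(path, "SETREPLICATION", replication=factor), method="PUT")
    with urlopen(req, timeout=10) as resp:
        return json.load(resp)["boolean"]
```

For example, set_replication("/data/input.txt", 4) asks for four copies of that (made-up) file. On a cluster with simple authentication you may also need to pass a user.name parameter for the change to be permitted.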

Hope this answers some of your questions.

With regards,
Yves

From: caesarsamsi@mac.com
To: user@hadoop.apache.org
Subject: Monitoring dashboard for Hadoop?
Date: Wed, 3 Jun 2015 17:25:43 -0400

Hello,

I'm new to Hadoop and have successfully built a fully distributed cluster of 3 nodes (1 master, 2 slaves) as a proof of concept. I have some questions below.

Is there a dashboard to monitor the progress of a MapReduce computation?

1. I'm looking to ensure the computation gets allocated to, and uses, the correct number of computation nodes

2. Monitor computation on the nodes (up/down/in-progress/completed)

3. If possible, direct computation to a specific group of nodes (depending on the computation priority).

Similarly for HDFS

1. Ensure each data file gets replicated to the correct number of nodes

2. If possible, prioritize data replication (i.e. replicate data files that are accessed frequently to nodes that have better hardware, so some sort of load-balancing distribution)

Many thanks,
Caesar.
