hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Trivial Update of "ZooKeeper/GSoCMonitoringAndWebInterface" by AndreiSavu
Date Mon, 21 Jun 2010 10:59:23 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "ZooKeeper/GSoCMonitoringAndWebInterface" page has been changed by AndreiSavu.
http://wiki.apache.org/hadoop/ZooKeeper/GSoCMonitoringAndWebInterface?action=diff&rev1=5&rev2=6

--------------------------------------------------

   * Assigned mentor: Patrick Hunt (phunt at apache dot org)
  
  == Abstract ==
- ZooKeeper is a complex distributed system. Understanding how well it is running is tremendously
important. Patrick Hunt has created a [[http://github.com/phunt/zookeeper_dashboard|Django-based
dashboard]] that allows some insight into how ZooKeeper is running. This is the foundation
I'm going to build on. This project would capture much more information from ZooKeeper, adding
hooks to retrieve it where necessary and visualize it in a appealing and useful way. I'm also
going to provide a bunch of monitoring recipes for systems like: Ganglia, Nagios, Cacti.
+ ZooKeeper is a complex distributed system. Understanding how well it is running is tremendously
important. Patrick Hunt has created a [[http://github.com/phunt/zookeeper_dashboard|Django-based
dashboard]] that allows some insight into how ZooKeeper is running. This is the foundation
I'm going to build on. This project would capture much more information from ZooKeeper, adding
hooks to retrieve it where necessary and visualize it in an appealing and useful way. I'm
also going to provide a bunch of monitoring recipes for systems like: Ganglia, Nagios, Cacti.
  
  == Work In Progress ==
-  * monitoring for Cacti and Ganglia
-  * commit as zookeeper-monitoring as a contrib
+  * cleanup and add more tests on zookeeper-monitoring
+  * submit [[http://github.com/andreisavu/zookeeper-monitoring|zookeeper-monitoring]] as
a contrib
+   * going to add a new JIRA for monitoring tools
+   * right now there is only one JIRA opened for Ganglia [[https://issues.apache.org/jira/browse/ZOOKEEPER-613|ZOOKEEPER-613]]
   * [[https://issues.apache.org/jira/browse/ZOOKEEPER-175|ZOOKEEPER-175]]
   * [[https://issues.apache.org/jira/browse/ZOOKEEPER-757|ZOOKEEPER-757]]
   * [[https://issues.apache.org/jira/browse/ZOOKEEPER-613|ZOOKEEPER-613]]
  
  == Done ==
-  * monitoring tools and recipes: [[http://github.com/andreisavu/zookeeper-monitoring|zookeeper-monitoring]]
: Nagios
+  * monitoring tools and recipes: [[http://github.com/andreisavu/zookeeper-monitoring|zookeeper-monitoring]]
: Nagios, Cacti and Ganglia
   * [[https://issues.apache.org/jira/browse/ZOOKEEPER-744|ZOOKEEPER-744]]
  
  == Milestones ==
  === Community Bonding (starts: 26 April ends: 24 May) ===
  Activities:
  
-  * read mail lists archives
+  * read mail lists archives - '''done'''
-  * read source code
+  * read source code- '''done'''
-  * discuss with the community members  (monitoring and administration requirements, production
stories)
+  * discuss with the community members  (monitoring and administration requirements, production
stories) - '''done'''
-  * discuss with the Adobe Hadoop / Hbase team about their specific monitoring requirements
+  * discuss with the Adobe Hadoop / Hbase team about their specific monitoring requirements - '''done'''
  
  Expected results:
  
-  * understand source code and the known bugs
+  * understand source code and the known bugs - '''done'''
-  * understand how the software is used in production
+  * understand how the software is used in production - '''done'''
+   * ZooKeeper is the kind of service that you put in production and forget about it
+   * got positive feedback: works as expected "out of the box"
+   * monitoring requirements: ensure that it keeps working as expected
-  * understand monitoring requirements
+  * understand monitoring requirements - '''done'''
-  * understand debugging requirements
+  * understand debugging requirements - '''done'''
-  * setup a development environment
+  * setup a development environment - '''done'''
+   * on the local machine running Ubuntu 9.10, java1.6, Eclipse, ant
+   * tracking my changes on github: http://github.com/andreisavu/zookeeper
  
  === Monitoring and Data Collection (starts: 24 May ends: 20 June ) ===
  Activities:
  
-  * deploy small scale (multinode) cluster for development (virtual machines)
+  * deploy small scale (multinode) cluster for development (virtual machines)  - '''done'''
+   * I've used [[http://github.com/phunt/zkconf|zkconf]] for this task. I've deployed local
"clusters" with 3,5 and 9 nodes
-  * identify important health signals add hooks (if needed) for realtime data collection
+  * identify important health signals add hooks (if needed) for realtime data collection - '''done'''
+   * added new 4letterword 'mntr' for monitoring - going to be released in zookeeper 3.4.0
+   * important signals: latency, packets sent / received, outstanding requests, znode count,
watch count, ephemerals count, followers count, synced followers, pending syncs, open file
descriptor count
-  * create scripts / plugins for cluster monitoring using Cacti, Ganglia, Nagios, SNMP
+  * create scripts / plugins for cluster monitoring using Cacti, Ganglia, Nagios - '''done'''
-  * document script install procedures
+   * [[http://github.com/andreisavu/zookeeper-monitoring|zookeeper-monitoring]]
+  * document script install procedures - '''done''' (I'm making the assumption the user has
previous experience configuring Nagios, Cacti or Ganglia)
-  * collaborate with the Adobe Hadoop / Hbase team and deploy the monitoring scripts in production
+  * collaborate with the Adobe Hadoop / Hbase team and deploy the monitoring scripts in production - '''work in progress'''
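
As an illustration of the 'mntr' four-letter word listed above, here is a minimal Python sketch of how a monitoring script might collect the health signals. It assumes the server replies with one tab-separated key/value pair per line and then closes the connection. The helper names parse_mntr and query_mntr are hypothetical, and the sample values are invented for illustration. This is not the code in zookeeper-monitoring.

```python
import socket


def parse_mntr(text):
    """Turn 'key<TAB>value' lines into a dict, converting integer values."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if not sep:
            continue  # skip blank or malformed lines
        value = value.strip()
        try:
            stats[key.strip()] = int(value)
        except ValueError:
            stats[key.strip()] = value  # e.g. version strings stay as text
    return stats


def query_mntr(host="localhost", port=2181, timeout=5.0):
    """Send 'mntr' to a ZooKeeper server and return the parsed reply.

    Host and port are assumptions; 2181 is the usual client port.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"mntr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return parse_mntr(b"".join(chunks).decode("utf-8", errors="replace"))


# Invented sample reply, used here so the sketch runs without a server.
SAMPLE = (
    "zk_version\t3.4.0\n"
    "zk_avg_latency\t0\n"
    "zk_outstanding_requests\t0\n"
    "zk_znode_count\t4\n"
    "zk_watch_count\t0\n"
)

stats = parse_mntr(SAMPLE)
```

A Nagios-style check could then compare a value such as stats["zk_outstanding_requests"] against a warning threshold; the threshold itself depends on the deployment.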
  
  Expected results:
  
-  * production ready scripts / plugins for monitoring
+  * production ready scripts / plugins for monitoring - '''done'''
-  * easy to understand and follow install guides
+  * easy to understand and follow install guides - '''done'''
  
  === Web Application (starts: 20 June ends: 9 august) ===
  Activities:
