zookeeper-notifications mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [zookeeper] maoling commented on a change in pull request #1073: ZOOKEEPER-3529: add a new doc: zookeeperUseCases.md
Date Sat, 21 Sep 2019 08:09:06 GMT
maoling commented on a change in pull request #1073: ZOOKEEPER-3529: add a new doc: zookeeperUseCases.md
URL: https://github.com/apache/zookeeper/pull/1073#discussion_r326851965

 File path: zookeeper-docs/src/main/resources/markdown/zookeeperUseCases.md
 @@ -0,0 +1,319 @@
+Copyright 2002-2019 The Apache Software Foundation
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+See the License for the specific language governing permissions and
+limitations under the License.
+# ZooKeeper Use Cases
+- Applications and organizations using ZooKeeper include (alphabetically)[1].
+- If you would like your use case listed here, please submit a pull request
or write an email to **dev@zookeeper.apache.org**,
+  and your use case will be included.
+- If this documentation violates your intellectual property rights or your or your company's
privacy, write an email to **dev@zookeeper.apache.org**
+  and we will handle it in a timely manner.
+## Free Software Projects
+### [AdroitLogic UltraESB](http://adroitlogic.org/)
+  - Uses ZooKeeper to implement node coordination, in clustering support. This allows the
management of the complete cluster,
+  or any specific node - from any other node connected via JMX. A Cluster wide command framework
developed on top of the
+  ZooKeeper coordination allows commands that fail on some nodes to be retried etc. We also
support the automated graceful
+  round-robin-restart of a complete cluster of nodes using the same framework[1].
+### [Akka](http://akka.io/)
+  - Akka is the platform for the next generation event-driven, scalable and fault-tolerant
architectures on the JVM.
+  Or: Akka is a toolkit and runtime for building highly concurrent, distributed, and fault
tolerant event-driven applications on the JVM[1].
+### [Eclipse Communication Framework](http://www.eclipse.org/ecf)
+  - The Eclipse ECF project provides an implementation of its Abstract Discovery services
using Zookeeper. ECF itself
+  is used in many projects providing base functionality for communication, all based on
+### [Eclipse Gyrex](http://www.eclipse.org/gyrex)
+  - The Eclipse Gyrex project provides a platform for building your own Java OSGi based clouds.

+  - ZooKeeper is used as the core cloud component for node membership and management, coordination
of jobs executing among workers,
+  a lock service and a simple queue service and a lot more[1].
+### [GoldenOrb](http://www.goldenorbos.org/)
+  - Massive-scale graph analysis[1].
+### [Juju](https://juju.ubuntu.com/)
+  - Service deployment and orchestration framework, formerly called Ensemble[1].
+### [Katta](http://katta.sourceforge.net/)
+  - Katta serves distributed Lucene indexes in a grid environment.
+  - Zookeeper is used for node, master and index management in the grid[1].
+### [KeptCollections](https://github.com/anthonyu/KeptCollections)
+  - KeptCollections is a library of drop-in replacements for the data structures in the Java
Collections framework.
+  - KeptCollections uses Apache ZooKeeper as a backing store, thus making its data structures
distributed and scalable[1].
+### [Neo4j](https://neo4j.com/)
+  - Neo4j is a Graph Database. It's a disk based, ACID compliant transactional storage engine
for big graphs and fast graph traversals,
+    using external indices like Lucene/Solr for global searches.
+  - We use ZooKeeper in the Neo4j High Availability components for write-master election,
+    read slave coordination and other cool stuff. ZooKeeper is a great and focused project
- we like![1]
+### [Norbert](http://sna-projects.com/norbert)
+  - Partitioned routing and cluster management[1].
+### [spring-cloud-zookeeper](https://spring.io/projects/spring-cloud-zookeeper)
+  - Spring Cloud Zookeeper provides Apache Zookeeper integrations for Spring Boot apps through
+    and binding to the Spring Environment and other Spring programming model idioms. With
a few simple annotations
+    you can quickly enable and configure the common patterns inside your application and
build large distributed systems with Zookeeper.
+    The patterns provided include Service Discovery and Distributed Configuration.
+### [Talend ESB](http://www.talend.com/products-application-integration/application-integration-esb-se.php)
+  - Talend ESB is a versatile and flexible enterprise service bus.
+  - It uses ZooKeeper as an endpoint repository for both REST and SOAP Web services.
+    By using ZooKeeper, Talend ESB is able to provide failover and load balancing capabilities
in a very light-weight manner[1].
+### [redis_failover](https://github.com/ryanlecompte/redis_failover)
+  - Redis Failover is a ZooKeeper-based automatic master/slave failover solution for Ruby.[1]
+## Apache Projects
+### [Apache Accumulo](https://accumulo.apache.org/)
+  - Accumulo is a distributed key/value store that provides expressive, cell-level access
+  - Apache ZooKeeper plays a central role within the Accumulo architecture. Its quorum consistency
model supports an overall
+    Accumulo architecture with no single points of failure. Beyond that, Accumulo leverages
ZooKeeper to store and communicate
+    configuration information for users and tables, as well as operational states of processes
and tablets.[2]
+### [Apache BookKeeper](https://bookkeeper.apache.org/)
+  - A scalable, fault-tolerant, and low-latency storage service optimized for real-time workloads.
+  - BookKeeper requires a metadata storage service to store information related to ledgers
and available bookies. BookKeeper currently uses
+    ZooKeeper for this and other tasks[3].
+### [Apache CXF DOSGi](http://cxf.apache.org/distributed-osgi.html)
+  - Apache CXF is an open source services framework. CXF helps you build and develop services
using frontend programming
+    APIs, like JAX-WS and JAX-RS. These services can speak a variety of protocols such as
+    or CORBA and work over a variety of transports such as HTTP, JMS or JBI.
+  - The Distributed OSGi implementation at Apache CXF uses ZooKeeper for its Discovery functionality.[4]
+### [Apache Druid (Incubating)](https://druid.apache.org/)
+  - Apache Druid (incubating) is a high performance real-time analytics database.
+  - Apache Druid (incubating) uses Apache ZooKeeper (ZK) for management of current cluster
state. The operations that happen over ZK are[27]:
+    - Coordinator leader election
+    - Segment "publishing" protocol from Historical and Realtime
+    - Segment load/drop protocol between Coordinator and Historical
+    - Overlord leader election
+    - Overlord and MiddleManager task management
+### [Apache Dubbo](http://dubbo.apache.org)
+  - Apache Dubbo is a high-performance, Java-based open source RPC framework.
+  - Zookeeper is used for service registration, discovery, and configuration management in
+### [Apache Flink](https://flink.apache.org/)
+  - Apache Flink is a framework and distributed processing engine for stateful computations
over unbounded and bounded data streams.
+    Flink has been designed to run in all common cluster environments, perform computations
at in-memory speed and at any scale.
+  - To enable JobManager High Availability you have to set the high-availability mode to
zookeeper, configure a ZooKeeper quorum and set up a masters file with all JobManagers hosts
and their web UI ports.
+    Flink leverages ZooKeeper for distributed coordination between all running JobManager
instances. ZooKeeper is a separate service from Flink,
+    which provides highly reliable distributed coordination via leader election and light-weight
consistent state storage[23].
+### [Apache Flume](https://flume.apache.org/)
+  - Flume is a distributed, reliable, and available service for efficiently collecting, aggregating,
and moving large amounts
+    of log data. It has a simple and flexible architecture based on streaming data flows.
It is robust and fault tolerant
+    with tunable reliability mechanisms and many failover and recovery mechanisms. It uses
a simple extensible data model
+    that allows for online analytic application.
+  - Flume supports Agent configurations via Zookeeper. This is an experimental feature.[5]
+### [Apache Hadoop](http://hadoop.apache.org/)
+  - The Apache Hadoop software library is a framework that allows for the distributed processing
of large data sets across
+    clusters of computers using simple programming models. It is designed to scale up from
single servers to thousands of machines,
+    each offering local computation and storage. Rather than rely on hardware to deliver
+    the library itself is designed to detect and handle failures at the application layer,
so delivering a highly-available service on top of a cluster of computers, each of which may
be prone to failures.
+  - The implementation of automatic HDFS failover relies on ZooKeeper for the following things:
+    - **Failure detection** - each of the NameNode machines in the cluster maintains a persistent
session in ZooKeeper.
+      If the machine crashes, the ZooKeeper session will expire, notifying the other NameNode
that a failover should be triggered.
+    - **Active NameNode election** - ZooKeeper provides a simple mechanism to exclusively
elect a node as active. If the current active NameNode crashes,
+      another node may take a special exclusive lock in ZooKeeper indicating that it should
become the next active.
+  - The ZKFailoverController (ZKFC) is a new component which is a ZooKeeper client which
also monitors and manages the state of the NameNode.
+    Each of the machines which runs a NameNode also runs a ZKFC, and that ZKFC is responsible
+    - **Health monitoring** - the ZKFC pings its local NameNode on a periodic basis with
a health-check command.
+      So long as the NameNode responds in a timely fashion with a healthy status, the ZKFC
considers the node healthy.
+      If the node has crashed, frozen, or otherwise entered an unhealthy state, the health
monitor will mark it as unhealthy.
+    - **ZooKeeper session management** - when the local NameNode is healthy, the ZKFC holds
a session open in ZooKeeper.
+      If the local NameNode is active, it also holds a special “lock” znode. This lock
uses ZooKeeper’s support for “ephemeral” nodes;
+      if the session expires, the lock node will be automatically deleted.
+    - **ZooKeeper-based election** - if the local NameNode is healthy, and the ZKFC sees
that no other node currently holds the lock znode,
+      it will itself try to acquire the lock. If it succeeds, then it has “won the election”,
and is responsible for running a failover to make its local NameNode active.
+      The failover process is similar to the manual failover described above: first, the
previous active is fenced if necessary,
+      and then the local NameNode transitions to active state.[7]
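
The ZKFC behaviour described above (an ephemeral lock znode that disappears when its owner's session expires, letting a healthy standby win the election) can be illustrated with a small in-memory sketch. This is not Hadoop's or ZooKeeper's actual implementation and does not talk to a real ensemble: `FakeZooKeeper`, `FailoverController`, and the lock path are hypothetical names introduced only for illustration, and fencing of the previous active is deliberately left out.

```python
"""In-memory sketch of ephemeral-lock leader election (no real ZooKeeper involved)."""


class FakeZooKeeper:
    """Hypothetical stand-in for an ensemble: maps ephemeral znode paths to owning sessions."""

    def __init__(self):
        self._ephemerals = {}  # znode path -> id of the session that created it

    def create_ephemeral(self, path, session_id):
        """Create the znode if absent. Return True on success, False if it already exists."""
        if path in self._ephemerals:
            return False
        self._ephemerals[path] = session_id
        return True

    def owner(self, path):
        """Return the session id holding the znode, or None if the znode does not exist."""
        return self._ephemerals.get(path)

    def expire_session(self, session_id):
        """Session expiry removes every ephemeral znode the session created."""
        self._ephemerals = {
            p: s for p, s in self._ephemerals.items() if s != session_id
        }


class FailoverController:
    """Hypothetical, heavily simplified ZKFC: one controller per NameNode."""

    LOCK_PATH = "/election/active-lock"  # illustrative path, not Hadoop's real one

    def __init__(self, name, zk):
        self.name = name      # also used as the session id in this sketch
        self.zk = zk
        self.healthy = True   # result of the local health check
        self.active = False   # whether the local NameNode is active

    def try_become_active(self):
        """Only a healthy node competes; whoever creates the lock znode wins."""
        if not self.healthy:
            return False
        if self.zk.create_ephemeral(self.LOCK_PATH, self.name):
            self.active = True
        return self.active

    def crash(self):
        """Simulate a crash: the session expires, so the lock znode is deleted."""
        self.healthy = False
        self.active = False
        self.zk.expire_session(self.name)


if __name__ == "__main__":
    zk = FakeZooKeeper()
    nn1 = FailoverController("nn1", zk)
    nn2 = FailoverController("nn2", zk)

    print("nn1 elected:", nn1.try_become_active())
    print("nn2 elected:", nn2.try_become_active())
    nn1.crash()
    print("lock owner after crash:", zk.owner(FailoverController.LOCK_PATH))
    print("nn2 elected after crash:", nn2.try_become_active())
```

In a real deployment the standby would learn about the deleted lock through a ZooKeeper watch rather than by retrying explicitly, and the winner would fence the previous active before transitioning; both steps are omitted here to keep the sketch short.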
+### [Apache HBase](https://hbase.apache.org/)
+  - HBase is the Hadoop database. It's an open-source, distributed, column-oriented store
+  - HBase uses ZooKeeper for master election, server lease management, bootstrapping, and
coordination between servers.
+    A distributed Apache HBase installation depends on a running ZooKeeper cluster. All participating
nodes and clients
+    need to be able to access the running ZooKeeper ensemble.[8]
+  - As you can see, ZooKeeper is a fundamental part of HBase. All operations that require
coordination, such as Regions
+    assignment, Master-Failover, replication, and snapshots, are built on ZooKeeper[20].
+### [Apache Helix](http://helix.apache.org/)
 Review comment:
   - This PR is now back under review.
   - I'm happy to wait and see whether anyone else wants to add further entries.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:

With regards,
Apache Git Services
