incubator-cvs mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Incubator Wiki] Update of "HoyaProposal" by SteveLoughran
Date Fri, 03 Jan 2014 13:47:57 GMT
The "HoyaProposal" page has been changed by SteveLoughran:
https://wiki.apache.org/incubator/HoyaProposal?action=diff&rev1=1&rev2=2

  (AM), which forked the HBase master as child process, and requested and
  released containers for HBase worker nodes based on demand.
  
+ Key features of this version were:
+  *  Automatic download and expansion of {{{.tar}}} and {{{.tar.gz}}} binaries from HDFS, which allows for side-by-side clusters of different versions of an application.
+  *  The option to bypass the tarball download and run with pre-installed binaries.
+  *  Patching of {{{hadoop-site.xml}}} for filesystem, ZooKeeper and port bindings, as well as other user-specified configurations, when the application is initially specified. The user identifies a template {{{conf/}}} directory which is snapshotted and then patched.
+  *  Flexing of cluster size: grow or shrink as requested, optionally persisted for future runs.
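
The configuration patching described in the list above can be sketched as follows. This is a minimal illustration, not Hoya's implementation: the class name {{{SiteXmlPatcher}}} and its {{{patch}}} method are hypothetical, and the sketch assumes only the standard Hadoop site-file layout ({{{<configuration>}}} containing {{{<property>}}} elements with {{{<name>}}} and {{{<value>}}} children). Snapshotting the template {{{conf/}}} directory and writing the result back are left out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper (not part of Hoya): patches a Hadoop-style site XML template. */
public class SiteXmlPatcher {

    /**
     * Returns the template with every override applied. Properties already
     * present in the template are updated; missing ones are appended.
     */
    public static String patch(String templateXml, Map<String, String> overrides) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(templateXml)));
        Element root = doc.getDocumentElement(); // the <configuration> element
        Map<String, String> pending = new LinkedHashMap<>(overrides);

        // Pass 1: overwrite the value of any property the template already defines.
        NodeList properties = root.getElementsByTagName("property");
        for (int i = 0; i < properties.getLength(); i++) {
            Element property = (Element) properties.item(i);
            String name = childText(property, "name");
            if (name != null && pending.containsKey(name)) {
                setChildText(doc, property, "value", pending.remove(name));
            }
        }
        // Pass 2: append the overrides that the template did not contain.
        for (Map.Entry<String, String> entry : pending.entrySet()) {
            Element property = doc.createElement("property");
            setChildText(doc, property, "name", entry.getKey());
            setChildText(doc, property, "value", entry.getValue());
            root.appendChild(property);
        }

        // Serialize the patched document back to a string.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    /** Text of the first child element with the given tag, or null if absent. */
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    /** Sets the text of the first child with the given tag, creating it if needed. */
    private static void setChildText(Document doc, Element parent, String tag, String text) {
        NodeList nodes = parent.getElementsByTagName(tag);
        Element child;
        if (nodes.getLength() == 0) {
            child = doc.createElement(tag);
            parent.appendChild(child);
        } else {
            child = (Element) nodes.item(0);
        }
        child.setTextContent(text);
    }
}
```

A caller would pass the template text plus a map of bindings chosen at cluster creation time, for example a ZooKeeper quorum and a port; any property names used this way are illustrative rather than the set Hoya actually patches.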
+ 
  Since then Hoya has evolved based on the experiences of using the
  previous iterations, and a long-term goal of creating and managing distributed applications
  
+  * Added the notion of a ''Provider'', a set of classes containing the code to support different
applications; factoring out HBase support into one such provider.
+  * Moved from a simple model of "one master, many workers", to one of multiple roles; each role implementing a different part of a larger distributed application.
+  * Moved the HBase master to be just another role, allowing HBase clusters to be created
with multiple backup master nodes -for better availability.
+  * Added an Accumulo provider, with support for its many different roles: {{{master}}},
{{{tserver}}}, {{{monitor}}}, {{{tracer}}} and {{{gc}}}.
- 
-  1. Deployment of different clustered applications via a provider plugin: HBase and Accumulo
being the two currently supported. 
-  1. A notion of different ''roles'' in an application. For HBase the two roles
- are {{{master}}} and {{{worker}}}; Accumulo has five: {{{master}}}, {{{tserver}}}, {{{monitor}}},
{{{tracer}}} and {{{gc}}}.
-  1. Automatic download and expansion of {{{.tar}}} and {{{.tar.gz}}} binaries from HDFS
- -which allows for side-by-side clusters of different versions of an application.
-  1. The option to bypass the tarball download and run with pre-installed binaries.
-  1. Patching of site XML for ZK bindings as well as other user-specified configurations
when the application
-  is initially specified. The user identifies a template {{{conf/}}} directory which is snapshotted
and then patched.
-  1. Manual flexing of cluster size -grow or shrink as requested, optionally persisted for
future runs.
-  1. Placement tracking, [[https://github.com/hortonworks/hoya/blob/master/src/site/markdown/rolehistory.md|"role
history"]].
-  This is quite a sophisticated little bit of code -we persist the placement history to HDFS
whenever it changes, then use this to build a list of which nodes to request containers on.
It increases the likelihood that the workers come up on nodes that have the data, so even
if you bring up a small hbase cluster in a large YARN cluster, there's not that much data
to be moved around. [It's a best-effort request, not an absolute requirement request, though
that could always be made a switch.
+  * Added [[https://github.com/hortonworks/hoya/blob/master/src/site/markdown/rolehistory.md|role placement history]]. This is quite a sophisticated piece of code: the placement history is persisted to HDFS whenever it changes, then used to build a list of which nodes to request containers on. This increases the likelihood that the workers come up on nodes that have the data, so even if you bring up a small HBase cluster in a large YARN cluster, not much data needs to be moved around.
+  * Added initial failure tracking, to track servers that appear to be unreliable, and to recognize when so many role instances are failing that the Hoya cluster should consider itself failed.
-  1. failure tracking, to track servers that appear to be unreliable,
-  and to recognize when so many role instances are failing that the Hoya cluster
-  should consider itself failed
-  1. Secure clusters -for as long as the YARN and HDFS tokens remain valid in the Application
Master.
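
The role placement history idea from the list above can be illustrated with a small in-memory sketch. It is an assumption-laden simplification, not Hoya's code: the class {{{PlacementHistory}}} and its methods are hypothetical, persistence to HDFS is omitted, and "most recently used first" is just one plausible ordering. The real design is described in the linked {{{rolehistory.md}}}.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only: remembers where each role ran last. */
public class PlacementHistory {
    // role name -> (hostname -> logical time at which the role last ran there)
    private final Map<String, Map<String, Long>> lastUsed = new HashMap<>();
    private long clock = 0;

    /** Records that an instance of the role was placed on the given host. */
    public void recordPlacement(String role, String host) {
        lastUsed.computeIfAbsent(role, r -> new HashMap<>()).put(host, ++clock);
    }

    /**
     * Returns up to {@code count} hosts to ask for first when requesting
     * containers for the role, most recently used host first. The result is
     * a preference list only; the scheduler may still place containers elsewhere.
     */
    public List<String> preferredHosts(String role, int count) {
        Map<String, Long> hosts =
                lastUsed.getOrDefault(role, Collections.<String, Long>emptyMap());
        List<String> ordered = new ArrayList<>(hosts.keySet());
        ordered.sort(Comparator.comparing((String h) -> hosts.get(h)).reversed());
        return new ArrayList<>(ordered.subList(0, Math.min(count, ordered.size())));
    }
}
```

Treating the history as a preference rather than a hard constraint means a role with no recorded placements simply yields an empty list, and the request falls back to any available node.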
  
+ In the process, the core application code was migrated to Java, support for secure Hadoop clusters was added, and the overall feature set and experience of using the tool were improved.
- 
-  * Added the notion of a ''Provider'', a set of classes containing the code to
-    support different applications; factoring out HBase support into one such
-    provider.
- 
-  * Moved from a simple model of "1 one master, many workers", to one of
-    multiple roles; each role implementing a different part of a larger
-    distributed application.
- 
-  * Added an Accumulo provider, with support for the many different roles:
-    masters, tablet servers, monitor, and garbage collector.
- 
-  * Moved the HBase master to be just another role, allowing HBase clusters to
-    be created with multiple backup master nodes -for better availability.
- 
-  * Added (persistent) role placement history so that a restarted cluster can
-    request role instances on the same nodes used earlier -this increases data
-    locality and so can result in faster startup times.
- 
-  * Added initial failure tracking, to track servers that appear to be unreliable,
-  and to recognize when so many role instances are failing that the Hoya cluster
-  should consider itself failed
- 
- In the process the core application code was migrated to Java, support for
- secure Hadoop clusters added, and the overall feature set and experience of
- using the tool improved.
  
  == Rationale ==
  
- The Hadoop "stack" has long included applications above the core HDFS and
+ The Hadoop "stack" has long included applications above the core HDFS and 
  MapReduce layers, with the column table database, Apache HBase, one of the key
  applications. To date, an HBase region server has been expected to have been
  deployed on every server in the cluster, managed by the operations team and
@@ -125, +96 @@

  
  == Initial Goals ==
  
-  * Donate the (already ASF-licensed) Hoya source code and documentation to the
+  1. Donate the (already ASF-licensed) Hoya source code and documentation to the Apache Software
Foundation.
-    Apache Software Foundation.
- 
-  * Setup and standardize the open governance of the Hoya project.
+  1. Set up and standardize the open governance of the Hoya project.
- 
-  * Build a user and developer community
+  1. Build a user and developer community
+  1. Tie in better with HBase, Accumulo and other projects that can take advantage of a YARN cluster but are not yet ready to be explicitly migrated to YARN.
- 
-  * Tie in better with HBase, Accumulo and other projects that can take
-    advantage of a YARN cluster -yet which are not ready to be explicitly
-    migrated to YARN.
- 
-  * Build a system that can be used to migrate more distributed applications
+  1. Build a system that can be used to migrate more distributed applications into YARN clusters.
-    into YARN clusters.
  
  == Current Status ==
  
@@ -172, +135 @@

  
  For Hadoop, it, along with Samza, drives the work of supporting long-lived
  services in YARN, work listed in the
- [YARN-896](https://issues.apache.org/jira/browse/YARN-896) work. While many of
+ [[https://issues.apache.org/jira/browse/YARN-896|YARN-896]] work. While many of
  the issues listed under YARN-896 relate to service longevity, there is also the
  challenge of having low-latency table lookups co-exist with CPU-and-IO
  intensive analytics workloads. This may drive future developments in Hadoop
@@ -192, +155 @@

  
  == Documentation ==
  
+ All Hoya documentation is currently in [[https://github.com/hortonworks/hoya/tree/master/src/site/markdown|markdown-formatted text files in the source repository]]; they will be delivered as part of the initial source donation.
- All Hoya documentation is currently in markdown-formatted text files in the
- source repository; they will be delivered as part of the initial source
- donation.
  
  == Initial Source ==
  
- All initial source can be found at [https://github.com/hortonworks/hoya](https://github.com/hortonworks/hoya)

+ All initial source can be found at [[https://github.com/hortonworks/hoya]]
  
  == Source and IP Submission Plan ==
  
- 1. All source will be moved to Apache Infrastructure
+  1. All source will be moved to Apache Infrastructure
- 1. All outstanding issues in our in-house JIRA infrastructure will be replicated into the
Apache JIRA system.
+  1. All outstanding issues in our in-house JIRA infrastructure will be replicated into the
Apache JIRA system.
  
  == External Dependencies ==
  
  Hoya has no external dependencies except for some Java libraries that are
- considered ASF-compatible (JUnit, SLF4J), and Apache artifacts : Hadoop, HBase,
+ considered ASF-compatible (JUnit, SLF4J, jcommander), and Apache artifacts: Hadoop, HBase,
- Accumulo.
+ Accumulo, Log4J and others.
  
  == Required Resources ==
  
@@ -217, +178 @@

   1. hoya-dev
   1. hoya-commits
   1. hoya-private
+ 
+ Infrastructure 
-  1. git repository
+  1. Git repository
- 
- Jenkins builds on x86-Linux, ARM-Linux and Windows hooked up to JIRA
+  1. Jenkins builds on x86-Linux, ARM-Linux and Windows hooked up to JIRA
- 
- Gerrit would be useful for reviewing, if available.
+  1. Gerrit would be useful for reviewing, if available.
  
  == Initial Committers ==
  
@@ -229, +190 @@

   1. Billie Rinaldi 
   1. Ted Yu
   1. Josh Elser
+  1. Devaraj Das
  
  == Sponsors ==
  
