hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase/HBaseWireCompatibility" by Misty
Date Tue, 20 Oct 2015 00:08:03 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hbase/HBaseWireCompatibility" page has been changed by Misty:

- <<TableOfContents(5)>>
+ The HBase Wiki is in the process of being decommissioned. The info that used to be on this
page has moved to http://hbase.apache.org/book.html#hbase.versioning. Please update your bookmarks.
- === Glossary ===
- ----
- ||Term ||Definition ||
- ||Major version ||First number in the version, to the left of the period, e.g. in version 2.3 the major version is "2" ||
- ||Minor version ||Second number in the version, immediately to the right of the period, e.g. in version 2.3 the minor version is "3" ||
- ||Compatibility window ||Range of consecutive major versions where compatibility between two entities is guaranteed ||
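
As a rough illustration of the glossary terms, the sketch below splits a version string into major and minor numbers and checks whether two versions fall inside a compatibility window. The names `Version`, `parse_version`, `within_window`, and the `window` parameter are hypothetical and not part of HBase; the window size itself is an open policy question on this page.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int  # first number, left of the period
    minor: int  # second number, right of the period


def parse_version(text: str) -> Version:
    """Split a 'major.minor' string into its numeric parts (assumed format)."""
    major, minor = text.split(".")[:2]
    return Version(int(major), int(minor))


def within_window(a: Version, b: Version, window: int) -> bool:
    """True if both major versions fit in `window` consecutive major versions."""
    return abs(a.major - b.major) < window
```

With a hypothetical window of 4, `within_window(parse_version("2.3"), parse_version("5.0"), 4)` holds, while versions 2.3 and 6.0 fall outside the window.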
- === Motivation and Goals ===
- ----
- The current lack of a concrete versioning story for HBase is limiting  from both an operational
and development perspective.  We propose a  "first-pass" versioning story (that can be expanded
upon later) that  addresses the following use cases and concerns:
- '''Operations'''
-  * '''Decouple client applications from HBase''': HBase clients are part of a separate
application and are often administered separately from the HBase cluster. Today, the application
and cluster must be upgraded in lockstep. Clients should interoperate with HBase RS's and
masters that are running different major versions. This allows for the following operational
-   * Multiple pods: HBase clients may write to multiple HBase clusters  / pods (sharded clusters)
and the shards may be upgraded separately.
-   * Application-level replication: An HBase installation with active and standby clusters
should be able to upgrade them separately, while HBase clients continue to work with both.
-  * '''No downtime for minor version upgrades'''
- '''Development'''
-  * '''Simplified support for bugfixes, upgrades, and testing''' -  no need for specialized
migration scripts
-  * '''Higher developer cadence in the community''' - can add functionality and not worry
about breaking version compatibility
- === Requirements ===
- ----
-  * HBase server-server running different '''minor''' versions shall interoperate in an extensible
-  * HBase client-server running different '''major''' versions shall interoperate in an extensible
-   * For example, in a scenario where the client is running version A and the server is running
version B: anything the other side does not understand is ignored, given a default,
or otherwise handled in an appropriate manner.
-  * Formats and protocols shall be extensible to allow for new functionality such as RPC
-  * Developers shall be able to augment RPC protocol with '''new''' methods within minor
and major version upgrades.
-  * Critical path operations (Get/Put) shall be no more than 10% slower than in the
current 0.92 version on YCSB load tests (i.e. read/update/scan/insert should
individually be no more than 10% slower).
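
The "ignore or default" rule in the client-server requirement can be sketched as follows. This is a minimal illustration that uses plain dictionaries in place of real protobuf messages; `KNOWN_FIELDS`, its field names, and `decode_request` are assumptions made up for the example, not HBase or protobuf APIs.

```python
# Hypothetical set of fields this side understands, each with a default.
KNOWN_FIELDS = {"row": b"", "max_versions": 1, "cache_blocks": True}


def decode_request(wire_fields: dict) -> dict:
    """Keep known fields, default the missing ones, ignore unknown ones."""
    request = dict(KNOWN_FIELDS)  # start from the defaults
    for name, value in wire_fields.items():
        if name in KNOWN_FIELDS:  # fields from a newer peer are skipped
            request[name] = value
    return request
```

A request from a newer peer that carries an extra `trace_id` field decodes without error, and a request from an older peer that omits `max_versions` picks up the default.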
- === Design ===
- ----
- ==== Wire format ====
- Protobuf vs. Thrift vs. Avro
- We propose to use protobuf for wire format. The primary reason is  that the current HBase
RPC engine (see HADOOP-7379) supports  protobuf-encoded data, and protobuf is relatively more
stable than the  alternatives.  In addition, Hadoop RPC uses protobuf, and the community 
may eventually want Hadoop and HBase to share the same RPC.
- We also propose to change the HBase RPC connection header from  Writable to  protobuf so
that the HBase RPC is programming language  agnostic.
- ==== RPC ====
- Currently, the HBase RPC engine does not support async IO or protocol  negotiation.  These
features don't impact compatibility and therefore  can evolve separately and are not in scope
for this document.
- ==== Interfaces ====
- {{http://docs.google.com/a/cloudera.com/leaf?id=0BzYqRa05S66NMDcxMjUyYTMtZWE2Yy00ZmIyLThiMjgtMjJkNGU0NGU5OTg1}}
-  1. Client talks to ZK to find out the location of the master and the root region server.
-  1. Client applications talk to RS using '''HRegionInterface''' to read from/write to/scan
a table, and so on.
-  1. Client applications talk to master using '''HMasterInterface''' to dynamically create
a table, add a column family, and so on.
-  1. Master talks to RS using '''HRegionInterface''' to open/close/move/split/flush regions,
and so on.
-  1. Master puts data in ZK to store the active master and root region  server location,
create log splitting tasks, track RS's status, and so  on.
-  1. RS reads data in ZK to track log splitting tasks and updates it to grab a task and report
status, creates a node for the RS so that the master can track the status of this RS, tracks master
location and cluster status, and so on.
-  1. RS talks to master using '''HMasterRegionInterface''' to report RS load, RS fatal errors,
and RS startup.
-  1. Occasionally, RS talks to root region or meta region with '''HRegionInterface''' to
check the status of a region, create new daughter regions in region splitting, and so on.
- === Phasing ===
- ----
- The order of phases is based on priority. They can be done in parallel if there are enough
- ==== Phase 0: HBASE-4403: Separate existing APIs into public and private interfaces ====
- In order to define which APIs can be changed, we need to separate existing APIs into public
and private.
- ==== Phase 1: Compatibility between client applications and HBase clusters ====
- Goal:
-  . To make HBase client applications work properly with HBase clusters of different major
and minor versions.
- Note: this phase deals with interfaces 1, 2, and 3 in the interface graph (we get 8 "for free"). These tasks can be
sub-tasks of [[https://issues.apache.org/jira/browse/HBASE-5305|HBASE-5305 Improve cross-version
compatibility & upgradeability]] or [[https://issues.apache.org/jira/browse/HBASE-5306|HBASE-5306
Add support for protocol buffer based RPC]].  HBASE-5306 can also include a new RPC engine
(the latest Hadoop one). This plan focuses on the data encoding/decoding.
- Tasks:
-  * Replace RPC negotiation with extensible PB-based types
-  * Replace root and master address znodes in ZK with PB-enabled types  (goal: client's ZK
interactions become extensible) (1 in the graph)
-  * Replace existing HRegionInterface calls for read from/write to/scan  a table...  with
PB-enabled types (goal: client->RS and RS->RS  RPC becomes extensible) (2 in the graph)
-  * Replace existing HMasterInterface calls with PB-enabled types (goal: client->master
RPC becomes extensible) (3 in the graph)
-  * Replace data stored in .META. and -ROOT- tables with PB-enabled  types (goal: client
can read from old and/or new .META. and -ROOT-  tables) (2 in the graph)
- ==== Phase 2: HBase cluster rolling upgrade within same major version ====
- Goal:
-  . To allow an HBase cluster to perform a rolling upgrade within the same major version.
- Note: this phase deals with interfaces 4, 5, 6, and 7 in the interface graph.
- Tasks:
-  * Replace existing HRegionInterface calls for  open/close/move/split/flush regions... with
PB-enabled types (goal:  master->RS RPC becomes extensible) (4 in the graph)
-  * Replace Writables used in ZK for communication between RS and  master with PB-enabled
types (goal: RS and master ZK interactions become  extensible) (5, 6 in the graph)
-  * Replace existing HMasterRegionInterface calls with PB-enabled types  (goal: RS->master
RPC becomes extensible) (7 in the graph)
-  * Add version information to each server's ZK data (master and RS's)  (goal: tracking live
version numbers, used for automatic wire-off of new  features in persistent data formats until
all servers have hit new  version) (5, 6 in the graph)
-  * Add version information to RS's on master status UI
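
One way to read the version-tracking task above is that a new persistent-format feature stays switched off until every live server reports a version at least as new as the one that introduced it. The sketch below assumes versions are published as `(major, minor)` tuples keyed by server name; `feature_enabled` and the data layout are hypothetical, not the actual ZK schema.

```python
def feature_enabled(server_versions: dict, required: tuple) -> bool:
    """True only once every live server has reached the `required` version.

    `server_versions` maps a server name to a (major, minor) tuple, standing
    in for the version data each server would publish in ZK (assumed layout).
    """
    if not server_versions:  # no live servers known: stay conservative
        return False
    return min(server_versions.values()) >= required
```

During a rolling upgrade the feature stays off while any RS still reports the old version, and turns on once the last one has been restarted.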
- === Open questions ===
- ----
- '''Technical'''
-  . - How do ZK security and HBase RPC security play into this? Should they be orthogonal?
-  . - Should pluggable encodings (thrift/avro/pb/writable) be in scope?
-  . - Should async IO servers and clients be in scope or not?
- '''Policy'''
-  . - What is the policy for existing versions (89, 90, 92, 94) -- do we support them or
require one major upgrade before they get this story?
-  . - Developers should be able to remove deprecated methods or arguments to  maintain flexibility,
but can't do that within the compatibility window.  What should be our compatibility window?
2 years (roughly 4  major versions)?
-  . - What is the ZK version interoperability story?
-  . - What is the HDFS version interoperability story?
-  . - Should architectural-level changes require a major version bump?
- === Appendix ===
- ----
- ==== Future work (out of scope of this document) ====
-  * Possible to extend RPC with meta-data that can enable new functionality like RPC tracing
-  * Unify this with Hadoop RPC
-  * Online rolling upgrade of a single cluster between major versions: Today, major version
upgrades of a single cluster require downtime to upgrade all services in lockstep, while
some minor version updates can be applied via the rolling-restart script. HBase should
remain available through this process.
-  * Partial rollout: HBase clusters should allow for some nodes to  "try" a newer version
for testing purposes.  Today, this is a manual  process and possible only within minor versions.
(likely possible, would  like to not exclude this possibility).
-  * Cluster configuration changes: HBase should remain available as  configuration changes
(hbase-site.xml) or hotfixes are applied. Today,  rolling-restart script can be used to perform
this operation.
-  * Replication across different versions
-  * Disaster recovery: Operators should be able to smoke test a new version during the rolling
upgrade before turning on the new features for general use. If anything goes wrong during the
rolling upgrade, the operator should be able to roll back.
-  * ZK wire compatibility is necessary for RPCs between different versions of HBase and
ZK. Currently ZK supports backward compatibility for one version only. Different versions
of HBase could support different ZK versions.
-  * HDFS wire compatibility
-  * Data format changes may prevent minor or major version roll-back.
-  * Security RPC data compression/encryption changes may prevent minor or major version roll-back
-  * Persistent Data is stored in version specific formats in HDFS (xml    configs, regioninfo,
tableinfo).  Some of these data encodings and    formats are directly exposed; for example,
ZK is not exposed as an API.
- === References ===
- ----
-  . Dapper: http://research.google.com/pubs/pub36356.html
-  . Cross version upgrade and compatibility: https://issues.apache.org/jira/browse/HBASE-5305
-  . Add protobuf based RPC to HBase: https://issues.apache.org/jira/browse/HBASE-5306
-  . Redo IPC/RPC: https://issues.apache.org/jira/browse/HBASE-2182
-  . HDFS wire compatibility: [[https://issues.apache.org/jira/browse/HADOOP-7347|HADOOP-7347]]
-  . HDFS client wire compatibility: [[https://issues.apache.org/jira/browse/HDFS-2060|HDFS-2060]]
-  . HDFS data protocol wire compatibility: [[https://issues.apache.org/jira/browse/HDFS-2058|HDFS-2058]]
-  . Use protobuf objects in existing IPC: [[https://issues.apache.org/jira/browse/HADOOP-7379|HADOOP-7379]]
- === Meeting notes ===
- * [[HBaseWireCompatibility20120221]]
