hbase-commits mailing list archives

From jerry...@apache.org
Subject hbase git commit: HBASE-17430 Changed link from Google search to a direct link in docs
Date Mon, 09 Jan 2017 04:40:33 GMT
Repository: hbase
Updated Branches:
  refs/heads/master f92a14ade -> 97fd9051f


HBASE-17430 Changed link from Google search to a direct link in docs

Signed-off-by: Jerry He <jerryjch@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/hbase/repo
Commit: http://git-wip-us.apache.org/repos/asf/hbase/commit/97fd9051
Tree: http://git-wip-us.apache.org/repos/asf/hbase/tree/97fd9051
Diff: http://git-wip-us.apache.org/repos/asf/hbase/diff/97fd9051

Branch: refs/heads/master
Commit: 97fd9051f44e8fbe4ca99789bad7c35ede389b88
Parents: f92a14a
Author: Jan Hentschel <jan.hentschel@ultratendency.com>
Authored: Fri Jan 6 14:34:36 2017 +0100
Committer: Jerry He <jerryjch@apache.org>
Committed: Sun Jan 8 20:39:44 2017 -0800

----------------------------------------------------------------------
 src/main/asciidoc/_chapters/architecture.adoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hbase/blob/97fd9051/src/main/asciidoc/_chapters/architecture.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/architecture.adoc b/src/main/asciidoc/_chapters/architecture.adoc
index 339566a..e51cb14 100644
--- a/src/main/asciidoc/_chapters/architecture.adoc
+++ b/src/main/asciidoc/_chapters/architecture.adoc
@@ -873,7 +873,7 @@ The compressed BlockCache is disabled by default. To enable it, set `hbase.block
 
 As write requests are handled by the region server, they accumulate in an in-memory storage
system called the _memstore_. Once the memstore fills, its content are written to disk as
additional store files. This event is called a _memstore flush_. As store files accumulate,
the RegionServer will <<compaction,compact>> them into fewer, larger files. After
each flush or compaction finishes, the amount of data stored in the region has changed. The
RegionServer consults the region split policy to determine if the region has grown too large
or should be split for another policy-specific reason. A region split request is enqueued
if the policy recommends it.
 
-Logically, the process of splitting a region is simple. We find a suitable point in the keyspace
of the region where we should divide the region in half, then split the region's data into
two new regions at that point. The details of the process however are not simple.  When a
split happens, the newly created _daughter regions_ do not rewrite all the data into new files
immediately. Instead, they create small files similar to symbolic link files, named link:http://www.google.com/url?q=http%3A%2F%2Fhbase.apache.org%2Fapidocs%2Forg%2Fapache%2Fhadoop%2Fhbase%2Fio%2FReference.html&sa=D&sntz=1&usg=AFQjCNEkCbADZ3CgKHTtGYI8bJVwp663CA[Reference
files], which point to either the top or bottom part of the parent store file according to
the split point. The reference file is used just like a regular data file, but only half of
the records are considered. The region can only be split if there are no more references to
the immutable data files of the parent region. Those reference files are cleaned gradually
by compactions, so that the region will stop referring to its parents files,
and can be split further.
+Logically, the process of splitting a region is simple. We find a suitable point in the keyspace
of the region where we should divide the region in half, then split the region's data into
two new regions at that point. The details of the process however are not simple.  When a
split happens, the newly created _daughter regions_ do not rewrite all the data into new files
immediately. Instead, they create small files similar to symbolic link files, named link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/io/Reference.html[Reference
files], which point to either the top or bottom part of the parent store file according to
the split point. The reference file is used just like a regular data file, but only half of
the records are considered. The region can only be split if there are no more references to
the immutable data files of the parent region. Those reference files are cleaned gradually
by compactions, so that the region will stop referring to its parents files, and can
be split further.
 
 Although splitting the region is a local decision made by the RegionServer, the split process
itself must coordinate with many actors. The RegionServer notifies the Master before and after
the split, updates the `.META.` table so that clients can discover the new daughter regions,
and rearranges the directory structure and data files in HDFS. Splitting is a multi-task process.
To enable rollback in case of an error, the RegionServer keeps an in-memory journal about
the execution state. The steps taken by the RegionServer to execute the split are illustrated
in <<regionserver_split_process_image>>. Each step is labeled with its step number.
Actions from RegionServers or Master are shown in red, while actions from the clients are
show in green.
 

