hudi-commits mailing list archives

From bhavanisu...@apache.org
Subject [incubator-hudi] branch asf-site updated: [HUDI-590] [DOCS] Cut doc version 0.5.1 and update README with instruction to cut doc version from Mac
Date Fri, 06 Mar 2020 23:17:17 GMT
This is an automated email from the ASF dual-hosted git repository.

bhavanisudha pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new d02c401  [HUDI-590] [DOCS] Cut doc version 0.5.1 and update README with instruction to cut doc version from Mac
d02c401 is described below

commit d02c4012270c209bf7f2c322d879572a17a0acd1
Author: Bhavani Sudha Saktheeswaran <bhasudha@uber.com>
AuthorDate: Thu Mar 5 11:08:39 2020 -0800

    [HUDI-590] [DOCS] Cut doc version 0.5.1 and update README with instruction to cut doc version from Mac
    
    Updating site with recent doc changes
---
 README.md                                          |   19 +
 content/404.html                                   |    4 +
 content/activity.html                              |    4 +
 content/asf.html                                   |    4 +
 content/assets/css/main.css                        |    2 +-
 content/assets/js/lunr/lunr-store.js               |  175 ++-
 content/cn/activity.html                           |    4 +
 content/cn/community.html                          |    4 +
 content/cn/contributing.html                       |   13 +-
 content/cn/docs/0.5.0-docs-versions.html           |   10 +-
 ....0-docs-versions.html => 0.5.1-comparison.html} |   96 +-
 content/cn/docs/0.5.1-concepts.html                |  613 +++++++++++
 content/cn/docs/0.5.1-configurations.html          |  860 +++++++++++++++
 content/cn/docs/0.5.1-deployment.html              |  813 ++++++++++++++
 .../docs/0.5.1-docker_demo.html}                   |  286 +++--
 ...docs-versions.html => 0.5.1-docs-versions.html} |   39 +-
 ....0-docs-versions.html => 0.5.1-gcs_hoodie.html} |  112 +-
 content/cn/docs/0.5.1-migration_guide.html         |  444 ++++++++
 ...0-docs-versions.html => 0.5.1-performance.html} |  109 +-
 ....0-docs-versions.html => 0.5.1-powered_by.html} |  119 +-
 ...0.5.0-docs-versions.html => 0.5.1-privacy.html} |   73 +-
 ...querying_data.html => 0.5.1-querying_data.html} |   68 +-
 content/cn/docs/0.5.1-quick-start-guide.html       |  539 +++++++++
 ...5.0-docs-versions.html => 0.5.1-s3_hoodie.html} |  130 ++-
 content/cn/docs/0.5.1-use_cases.html               |  445 ++++++++
 content/cn/docs/0.5.1-writing_data.html            |  608 ++++++++++
 content/cn/docs/docs-versions.html                 |   10 +-
 content/cn/docs/querying_data.html                 |   32 +
 content/cn/releases.html                           |    4 +
 content/{roadmap.html => cn/security.html}         |   65 +-
 content/community.html                             |   24 +-
 content/contributing.html                          |   13 +-
 content/docs/0.5.0-docs-versions.html              |   10 +-
 content/docs/0.5.1-comparison.html                 |  433 ++++++++
 content/docs/0.5.1-concepts.html                   |  627 +++++++++++
 content/docs/0.5.1-configurations.html             |  824 ++++++++++++++
 content/docs/0.5.1-deployment.html                 |  984 +++++++++++++++++
 .../{docker_demo.html => 0.5.1-docker_demo.html}   |   30 +-
 ...docs-versions.html => 0.5.1-docs-versions.html} |   38 +-
 ....0-docs-versions.html => 0.5.1-gcs_hoodie.html} |  112 +-
 content/docs/0.5.1-migration_guide.html            |  443 ++++++++
 ...0-docs-versions.html => 0.5.1-performance.html} |  112 +-
 content/docs/0.5.1-powered_by.html                 |  462 ++++++++
 ...0.5.0-docs-versions.html => 0.5.1-privacy.html} |   73 +-
 ...querying_data.html => 0.5.1-querying_data.html} |  298 +++--
 ...art-guide.html => 0.5.1-quick-start-guide.html} |   43 +-
 ...5.0-docs-versions.html => 0.5.1-s3_hoodie.html} |  130 ++-
 ...5.0-docs-versions.html => 0.5.1-structure.html} |   70 +-
 content/docs/0.5.1-use_cases.html                  |  446 ++++++++
 content/docs/0.5.1-writing_data.html               |  632 +++++++++++
 content/docs/docker_demo.html                      |    2 +-
 content/docs/docs-versions.html                    |   10 +-
 content/docs/querying_data.html                    |  266 +++--
 content/docs/quick-start-guide.html                |    3 +-
 content/releases.html                              |   27 +-
 content/roadmap.html                               |    4 +
 content/{roadmap.html => security.html}            |   57 +-
 content/sitemap.xml                                |  140 +++
 content/strata.html                                |    4 +
 docs/_config.yml                                   |   28 +
 docs/_data/navigation.yml                          |   67 ++
 docs/_docs/0.5.1/0_1_s3_filesystem.cn.md           |   83 ++
 docs/_docs/0.5.1/0_1_s3_filesystem.md              |   82 ++
 docs/_docs/0.5.1/0_2_gcs_filesystem.cn.md          |   63 ++
 docs/_docs/0.5.1/0_2_gcs_filesystem.md             |   62 ++
 docs/_docs/0.5.1/0_3_migration_guide.cn.md         |   74 ++
 docs/_docs/0.5.1/0_3_migration_guide.md            |   72 ++
 docs/_docs/0.5.1/0_4_docker_demo.cn.md             | 1154 +++++++++++++++++++
 docs/_docs/0.5.1/0_4_docker_demo.md                | 1163 ++++++++++++++++++++
 docs/_docs/0.5.1/1_1_quick_start_guide.cn.md       |  162 +++
 docs/_docs/0.5.1/1_1_quick_start_guide.md          |  220 ++++
 docs/_docs/0.5.1/1_2_structure.md                  |   22 +
 docs/_docs/0.5.1/1_3_use_cases.cn.md               |   69 ++
 docs/_docs/0.5.1/1_3_use_cases.md                  |   68 ++
 docs/_docs/0.5.1/1_4_powered_by.cn.md              |   59 +
 docs/_docs/0.5.1/1_4_powered_by.md                 |   70 ++
 docs/_docs/0.5.1/1_5_comparison.cn.md              |   50 +
 docs/_docs/0.5.1/1_5_comparison.md                 |   58 +
 docs/_docs/0.5.1/2_1_concepts.cn.md                |  156 +++
 docs/_docs/0.5.1/2_1_concepts.md                   |  173 +++
 docs/_docs/0.5.1/2_2_writing_data.cn.md            |  224 ++++
 docs/_docs/0.5.1/2_2_writing_data.md               |  253 +++++
 docs/_docs/0.5.1/2_3_querying_data.cn.md           |  178 +++
 docs/_docs/0.5.1/2_3_querying_data.md              |  201 ++++
 docs/_docs/0.5.1/2_4_configurations.cn.md          |  470 ++++++++
 docs/_docs/0.5.1/2_4_configurations.md             |  433 ++++++++
 docs/_docs/0.5.1/2_5_performance.cn.md             |   64 ++
 docs/_docs/0.5.1/2_5_performance.md                |   66 ++
 docs/_docs/0.5.1/2_6_deployment.cn.md              |  435 ++++++++
 docs/_docs/0.5.1/2_6_deployment.md                 |  598 ++++++++++
 docs/_docs/0.5.1/3_1_privacy.cn.md                 |   25 +
 docs/_docs/0.5.1/3_1_privacy.md                    |   24 +
 docs/_docs/0.5.1/3_2_docs_versions.cn.md           |   21 +
 docs/_docs/0.5.1/3_2_docs_versions.md              |   19 +
 docs/_includes/nav_list                            |    7 +
 docs/_includes/quick_link.html                     |    4 +-
 docs/_pages/releases.md                            |    2 +-
 97 files changed, 18114 insertions(+), 886 deletions(-)

diff --git a/README.md b/README.md
index bffe5a3..bee1503 100644
--- a/README.md
+++ b/README.md
@@ -61,6 +61,17 @@ mkdir -p $VERSION && cp *.md $VERSION/
 
 This step changes the permalink (location where these pages would be placed) with a version prefix and also changes links to each other.
 
+On macOS (BSD sed), use these commands:
+```
+cd $VERSION
+sed -i '' -e "s/permalink: \/docs\//permalink: \/docs\/${VERSION}-/g" *.md
+sed -i '' -e "s/permalink: \/cn\/docs\//permalink: \/cn\/docs\/${VERSION}-/g" *.cn.md
+sed -i '' -e "s/](\/docs\//](\/docs\/${VERSION}-/g" *.md
+sed -i '' -e "s/](\/cn\/docs\//](\/cn\/docs\/${VERSION}-/g" *.cn.md
+for f in *.md; do [ -f "$f" ] && sed -i '' -e "1s/^//p; 1s/^.*/version: ${VERSION}/" "$f"; done
+```
+
+On Linux (GNU sed), use these:
 ```
 cd $VERSION
 sed -i "s/permalink: \/docs\//permalink: \/docs\/${VERSION}-/g" *.md
@@ -93,6 +104,14 @@ render the new version's equivalent navigation links.
 {% endif %}
 ```
 
+Final steps:
+ - In `_config.yml`, add a new entry under `previous_docs:` for this version, similar to `  - version: 0.5.0`.
+ - Edit `docs/_pages/index.md` to point to the latest release: change the latest-release text and update the href
+ link to point to the release tag on GitHub.
+ - In `docs/_pages/releases.md`, add a new section at the very top for this release (see `Release 0.5.0-incubating`
+ for reference). Ensure the links for the GitHub release tag, docs, source release, and raw release notes point to
+ this latest release. Also include the following subsections: `Download Information`, `Release Highlights`, and
+ `Raw Release Notes`.
 #### Link to this version's doc
 
 
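The macOS/Linux split in the README additions above comes down to the `-i` (in-place) syntax of BSD sed versus GNU sed: BSD sed requires an explicit, possibly empty, backup suffix as a separate argument (`-i ''`), while GNU sed takes the suffix attached to the flag or omits it. A minimal sketch of the permalink rewrite that works with either variant (the file name `sample.md` and the `VERSION` value are illustrative, not from the commit):

```shell
#!/bin/sh
# Illustrative version; the README expects $VERSION to already be set.
VERSION=0.5.1

# A sample doc front-matter line to transform.
cat > sample.md <<'EOF'
permalink: /docs/quick-start-guide.html
EOF

# GNU sed supports --version and accepts -i with no argument;
# BSD sed (macOS) rejects --version and requires a suffix after -i.
if sed --version >/dev/null 2>&1; then
  sed -i -e "s/permalink: \/docs\//permalink: \/docs\/${VERSION}-/g" sample.md      # GNU
else
  sed -i '' -e "s/permalink: \/docs\//permalink: \/docs\/${VERSION}-/g" sample.md   # BSD
fi

cat sample.md
# permalink: /docs/0.5.1-quick-start-guide.html
```

Either branch produces the same result: the version prefix is spliced into the permalink, which is exactly what relocates each page under its versioned URL.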
diff --git a/content/404.html b/content/404.html
index 284b85f..19ac986 100644
--- a/content/404.html
+++ b/content/404.html
@@ -156,6 +156,10 @@
 
           
         
+          <li><a href="/security" target="_self" rel="nofollow noopener noreferrer"><i class="fa fa-navicon" aria-hidden="true"></i> Report Security Issues</a></li>
+
+          
+        
       
     </ul>
   </div>
diff --git a/content/activity.html b/content/activity.html
index 34cc4e9..f965446 100644
--- a/content/activity.html
+++ b/content/activity.html
@@ -154,6 +154,10 @@
 
           
         
+          <li><a href="/security" target="_self" rel="nofollow noopener noreferrer"><i class="fa fa-navicon" aria-hidden="true"></i> Report Security Issues</a></li>
+
+          
+        
       
     </ul>
   </div>
diff --git a/content/asf.html b/content/asf.html
index a1adea2..d735edd 100644
--- a/content/asf.html
+++ b/content/asf.html
@@ -154,6 +154,10 @@
 
           
         
+          <li><a href="/security" target="_self" rel="nofollow noopener noreferrer"><i class="fa fa-navicon" aria-hidden="true"></i> Report Security Issues</a></li>
+
+          
+        
       
     </ul>
   </div>
diff --git a/content/assets/css/main.css b/content/assets/css/main.css
index b4ed304..73fc632 100644
--- a/content/assets/css/main.css
+++ b/content/assets/css/main.css
@@ -1 +1 @@
-table{border-color:#1ab7ea !important}.page a{color:#3b9cba !important}.page__content{font-size:17px}.page__content.releases{font-size:17px}.page__footer{font-size:15px !important}.page__footer a{color:#3b9cba !important}.page__content .notice,.page__content .notice--primary,.page__content .notice--info,.page__content .notice--warning,.page__content .notice--success,.page__content .notice--danger{font-size:0.8em !important}.page__content table{font-size:0.8em !important}.page__content ta [...]
+table{border-color:#1ab7ea !important}.page a{color:#3b9cba !important}.page__content{font-size:17px}.page__content.releases{font-size:17px}.page__footer{font-size:15px !important}.page__footer a{color:#3b9cba !important}.page__content .notice,.page__content .notice--primary,.page__content .notice--info,.page__content .notice--warning,.page__content .notice--success,.page__content .notice--danger{font-size:0.8em !important}.page__content table{font-size:0.8em !important}.page__content ta [...]
diff --git a/content/assets/js/lunr/lunr-store.js b/content/assets/js/lunr/lunr-store.js
index 606d4d1..4c75c1a 100644
--- a/content/assets/js/lunr/lunr-store.js
+++ b/content/assets/js/lunr/lunr-store.js
@@ -155,18 +155,183 @@ var store = [{
         "url": "http://0.0.0.0:4000/docs/0.5.0-privacy.html",
         "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
         "title": "文档版本",
-        "excerpt":"                                  Latest             英文版             中文版                                      0.5.0             英文版             中文版                       ","categories": [],
+        "excerpt":"                                  Latest             英文版             中文版                                      0.5.1             英文版             中文版                                      0.5.0             英文版             中文版                       ","categories": [],
         "tags": [],
         "url": "http://0.0.0.0:4000/cn/docs/0.5.0-docs-versions.html",
         "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
         "title": "Docs Versions",
-        "excerpt":"                                  Latest             English Version             Chinese Version                                      0.5.0             English Version             Chinese Version                       ","categories": [],
+        "excerpt":"                                  Latest             English Version             Chinese Version                                      0.5.1             English Version             Chinese Version                                      0.5.0             English Version             Chinese Version                       ","categories": [],
         "tags": [],
         "url": "http://0.0.0.0:4000/docs/0.5.0-docs-versions.html",
         "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
         "title": "S3 Filesystem",
         "excerpt":"In this page, we explain how to get your Hudi spark job to store into AWS S3. AWS configs There are two configurations required for Hudi-S3 compatibility: Adding AWS Credentials for Hudi Adding required Jars to classpath AWS Credentials Simplest way to use Hudi with S3, is to configure your...","categories": [],
         "tags": [],
+        "url": "http://0.0.0.0:4000/cn/docs/0.5.1-s3_hoodie.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "S3 Filesystem",
+        "excerpt":"In this page, we explain how to get your Hudi spark job to store into AWS S3. AWS configs There are two configurations required for Hudi-S3 compatibility: Adding AWS Credentials for Hudi Adding required Jars to classpath AWS Credentials Simplest way to use Hudi with S3, is to configure your...","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/docs/0.5.1-s3_hoodie.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "GCS Filesystem",
+        "excerpt":"For Hudi storage on GCS, regional buckets provide an DFS API with strong consistency. GCS Configs There are two configurations required for Hudi GCS compatibility: Adding GCS Credentials for Hudi Adding required jars to classpath GCS Credentials Add the required configs in your core-site.xml from where Hudi can fetch them....","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/cn/docs/0.5.1-gcs_hoodie.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "GCS Filesystem",
+        "excerpt":"For Hudi storage on GCS, regional buckets provide an DFS API with strong consistency. GCS Configs There are two configurations required for Hudi GCS compatibility: Adding GCS Credentials for Hudi Adding required jars to classpath GCS Credentials Add the required configs in your core-site.xml from where Hudi can fetch them....","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/docs/0.5.1-gcs_hoodie.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "Migration Guide",
+        "excerpt":"Hudi maintains metadata such as commit timeline and indexes to manage a dataset. The commit timelines helps to understand the actions happening on a dataset as well as the current state of a dataset. Indexes are used by Hudi to maintain a record key to file id mapping to efficiently...","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/cn/docs/0.5.1-migration_guide.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "Migration Guide",
+        "excerpt":"Hudi maintains metadata such as commit timeline and indexes to manage a table. The commit timelines helps to understand the actions happening on a table as well as the current state of a table. Indexes are used by Hudi to maintain a record key to file id mapping to efficiently...","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/docs/0.5.1-migration_guide.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "Docker Demo",
+        "excerpt":"A Demo using docker containers Lets use a real world example to see how hudi works end to end. For this purpose, a self contained data infrastructure is brought up in a local docker cluster within your computer. The steps have been tested on a Mac laptop Prerequisites Docker Setup...","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/cn/docs/0.5.1-docker_demo.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "Docker Demo",
+        "excerpt":"A Demo using docker containers Lets use a real world example to see how hudi works end to end. For this purpose, a self contained data infrastructure is brought up in a local docker cluster within your computer. The steps have been tested on a Mac laptop Prerequisites Docker Setup...","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/docs/0.5.1-docker_demo.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "Quick-Start Guide",
+        "excerpt":"本指南通过使用spark-shell简要介绍了Hudi功能。使用Spark数据源,我们将通过代码段展示如何插入和更新的Hudi默认存储类型数据集: 写时复制。每次写操作之后,我们还将展示如何读取快照和增量读取数据。 设置spark-shell Hudi适用于Spark-2.x版本。您可以按照此处的说明设置spark。 在提取的目录中,使用spark-shell运行Hudi: bin/spark-shell --packages org.apache.hudi:hudi-spark-bundle:0.5.0-incubating --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' 设置表名、基本路径和数据生成器来为本指南生成记录。 import org.apache.hudi.QuickstartUtils._ import scala.collection.JavaConversions._ import org.apache.spark.sql. [...]
+        "tags": [],
+        "url": "http://0.0.0.0:4000/cn/docs/0.5.1-quick-start-guide.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "Quick-Start Guide",
+        "excerpt":"This guide provides a quick peek at Hudi’s capabilities using spark-shell. Using Spark datasources, we will walk through code snippets that allows you to insert and update a Hudi table of default table type: Copy on Write. After each write operation we will also show how to read the data...","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/docs/0.5.1-quick-start-guide.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "Structure",
+        "excerpt":"Hudi (pronounced “Hoodie”) ingests &amp; manages storage of large analytical tables over DFS (HDFS or cloud stores) and provides three types of queries. Read Optimized query - Provides excellent query performance on pure columnar storage, much like plain Parquet tables. Incremental query - Provides a change stream out of the...","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/docs/0.5.1-structure.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "使用案例",
+        "excerpt":"以下是一些使用Hudi的示例,说明了加快处理速度和提高效率的好处 近实时摄取 将外部源(如事件日志、数据库、外部源)的数据摄取到Hadoop数据湖是一个众所周知的问题。 尽管这些数据对整个组织来说是最有价值的,但不幸的是,在大多数(如果不是全部)Hadoop部署中都使用零散的方式解决,即使用多个不同的摄取工具。 对于RDBMS摄取,Hudi提供 通过更新插入达到更快加载,而不是昂贵且低效的批量加载。例如,您可以读取MySQL BIN日志或Sqoop增量导入并将其应用于 DFS上的等效Hudi表。这比批量合并任务及复杂的手工合并工作流更快/更有效率。 对于NoSQL数据存储,如Cassandra / Voldemort / HBase,即使是中等规模大小也会存储数十亿行。 毫无疑问, 全量加载不可行,如果摄取需要跟上较高的更新量,那么则需要更有效的方法。 即使对于像Kafka
 样的不可变数据源,Hudi也可以 强制在HDFS上使用最小文件大小, 这采取了综合方式解决HDFS小文件问题来改善NameNode的健康状况。这对事件流来说更为 [...]
+        "tags": [],
+        "url": "http://0.0.0.0:4000/cn/docs/0.5.1-use_cases.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "Use Cases",
+        "excerpt":"Near Real-Time Ingestion Ingesting data from external sources like (event logs, databases, external sources) into a Hadoop Data Lake is a well known problem. In most (if not all) Hadoop deployments, it is unfortunately solved in a piecemeal fashion, using a medley of ingestion tools, even though this data is...","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/docs/0.5.1-use_cases.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "演讲 & Hudi 用户",
+        "excerpt":"已使用 Uber Hudi最初由Uber开发,用于实现低延迟、高效率的数据库摄取。 Hudi自2016年8月开始在生产环境上线,在Hadoop上驱动约100个非常关键的业务表,支撑约几百TB的数据规模(前10名包括行程、乘客、司机)。 Hudi还支持几个增量的Hive ETL管道,并且目前已集成到Uber的数据分发系统中。 EMIS Health EMIS Health是英国最大的初级保健IT软件提供商,其数据集包括超过5000亿的医疗保健记录。HUDI用于管理生产中的分析数据集,并使其与上游源保持同步。Presto用于查询以HUDI格式写入的数据。 Yields.io Yields.io是第一个使用AI在企业范围内进行自动模型验证和实时监控的金融科技平台。他们的数据湖由Hudi管理,他们还积极使用Hudi为增量式、跨语言/平台机器学习构建基础架构。 Yotpo Hudi在Yotpo有不少用途。首先,在他们的开源ETL框架中集成了Hudi作为CDC
 道的输出写入程序,即从数据库binlog生成的事件流到Kafka然后再写入S3。 演 [...]
+        "tags": [],
+        "url": "http://0.0.0.0:4000/cn/docs/0.5.1-powered_by.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "Talks & Powered By",
+        "excerpt":"Adoption Uber Apache Hudi was originally developed at Uber, to achieve low latency database ingestion, with high efficiency. It has been in production since Aug 2016, powering the massive 100PB data lake, including highly business critical tables like core trips,riders,partners. It also powers several incremental Hive ETL pipelines and being...","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/docs/0.5.1-powered_by.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "对比",
+        "excerpt":"Apache Hudi填补了在DFS上处理数据的巨大空白,并可以和这些技术很好地共存。然而, 通过将Hudi与一些相关系统进行对比,来了解Hudi如何适应当前的大数据生态系统,并知晓这些系统在设计中做的不同权衡仍将非常有用。   Kudu   Apache Kudu是一个与Hudi具有相似目标的存储系统,该系统通过对upserts支持来对PB级数据进行实时分析。 一个关键的区别是Kudu还试图充当OLTP工作负载的数据存储,而Hudi并不希望这样做。 因此,Kudu不支持增量拉取(截至2017年初),而Hudi支持以便进行增量处理。   Kudu与分布式文件系统抽象和HDFS完全不同,它自己的一组存储服务器通过RAFT相互通信。 与之不同的是,Hudi旨在与底层Hadoop兼容的文件系统(HDFS,S3或Ceph)一起使用,并且没有自己的存储服务器群,而是依靠Apache Spark来完成繁重的工作。 因此,Hudi可以像其他S
 park作业一样轻松扩展,而Kudu则需要硬件和运营支持,特别是HBase或Vertica等数据存储系统。 [...]
+        "tags": [],
+        "url": "http://0.0.0.0:4000/cn/docs/0.5.1-comparison.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "Comparison",
+        "excerpt":"Apache Hudi fills a big void for processing data on top of DFS, and thus mostly co-exists nicely with these technologies. However, it would be useful to understand how Hudi fits into the current big data ecosystem, contrasting it with a few related systems and bring out the different tradeoffs...","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/docs/0.5.1-comparison.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "概念",
+        "excerpt":"Apache Hudi(发音为“Hudi”)在DFS的数据集上提供以下流原语 插入更新 (如何改变数据集?) 增量拉取 (如何获取变更的数据?) 在本节中,我们将讨论重要的概念和术语,这些概念和术语有助于理解并有效使用这些原语。 时间轴 在它的核心,Hudi维护一条包含在不同的即时时间所有对数据集操作的时间轴,从而提供,从不同时间点出发得到不同的视图下的数据集。Hudi即时包含以下组件 操作类型 : 对数据集执行的操作类型 即时时间 : 即时时间通常是一个时间戳(例如:20190117010349),该时间戳按操作开始时间的顺序单调增加。 状态 : 即时的状态 Hudi保证在时间轴上执行的操作的原子性和基于即时时间的时间轴一致性。 执行的关键操作包括 COMMITS - 一次提交表示将一组记录原子写入到数据集中。 CLEANS - 删除数据集中不再需要的旧文件版本的后台
 活动。 DELTA_COMMIT - 增量提交是指将一批记录原子写入到MergeOnRead存储类型的数据集中,其中一些/所有数据都可以只写到增量日志中。 COMPA [...]
+        "tags": [],
+        "url": "http://0.0.0.0:4000/cn/docs/0.5.1-concepts.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "Concepts",
+        "excerpt":"Apache Hudi (pronounced “Hudi”) provides the following streaming primitives over hadoop compatible storages Update/Delete Records (how do I change records in a table?) Change Streams (how do I fetch records that changed?) In this section, we will discuss key concepts &amp; terminologies that are important to understand, to be able...","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/docs/0.5.1-concepts.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "写入 Hudi 数据集",
+        "excerpt":"这一节我们将介绍使用DeltaStreamer工具从外部源甚至其他Hudi数据集摄取新更改的方法, 以及通过使用Hudi数据源的upserts加快大型Spark作业的方法。 对于此类数据集,我们可以使用各种查询引擎查询它们。 写操作 在此之前,了解Hudi数据源及delta streamer工具提供的三种不同的写操作以及如何最佳利用它们可能会有所帮助。 这些操作可以在针对数据集发出的每个提交/增量提交中进行选择/更改。 UPSERT(插入更新) :这是默认操作,在该操作中,通过查找索引,首先将输入记录标记为插入或更新。 在运行启发式方法以确定如何最好地将这些记录放到存储上,如优化文件大小之类后,这些记录最终会被写入。 对于诸如数据库更改捕获之类的用例,建议该操作,因为输入几乎肯定包含更新。 INSERT(插入) :就使用启发式方法确定文件大小
 言,此操作与插入更新(UPSERT)非常相似,但此操作完全跳过了索引查找步骤。 因此,对于日志重复数据删除等用例(结合下面提到的过滤重复项的选项),它可以比插入更新快得多。 插入也适用于这种用 [...]
+        "tags": [],
+        "url": "http://0.0.0.0:4000/cn/docs/0.5.1-writing_data.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "Writing Hudi Tables",
+        "excerpt":"In this section, we will cover ways to ingest new changes from external sources or even other Hudi tables using the DeltaStreamer tool, as well as speeding up large Spark jobs via upserts using the Hudi datasource. Such tables can then be queried using various query engines. Write Operations Before...","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/docs/0.5.1-writing_data.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "查询 Hudi 数据集",
+        "excerpt":"从概念上讲,Hudi物理存储一次数据到DFS上,同时在其上提供三个逻辑视图,如之前所述。 数据集同步到Hive Metastore后,它将提供由Hudi的自定义输入格式支持的Hive外部表。一旦提供了适当的Hudi捆绑包, 就可以通过Hive、Spark和Presto之类的常用查询引擎来查询数据集。 具体来说,在写入过程中传递了两个由table name命名的Hive表。 例如,如果table name = hudi_tbl,我们得到 hudi_tbl 实现了由 HoodieParquetInputFormat 支持的数据集的读优化视图,从而提供了纯列式数据。 hudi_tbl_rt 实现了由 HoodieParquetRealtimeInputFormat 支持的数据集的实时视图,从而提供了基础数据和日志数据的合并视图。 如概念部分所述,增量处理所需要的 一个关键原语是增量拉取(以从数据集中获取更改流/日志)。您可以增量提取Hudi数据集,这意味着自指定的即时时间起, 您可
 只获得全部更新和新行。 这与插入更新一起使用,对于构建某 [...]
+        "tags": [],
+        "url": "http://0.0.0.0:4000/cn/docs/0.5.1-querying_data.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "Querying Hudi Tables",
+        "excerpt":"Conceptually, Hudi stores data physically once on DFS, while providing 3 different ways of querying, as explained before. Once the table is synced to the Hive metastore, it provides external Hive tables backed by Hudi’s custom inputformats. Once the proper hudi bundle has been installed, the table can be queried...","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/docs/0.5.1-querying_data.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "配置",
+        "excerpt":"该页面介绍了几种配置写入或读取Hudi数据集的作业的方法。 简而言之,您可以在几个级别上控制行为。 Spark数据源配置 : 这些配置控制Hudi Spark数据源,提供如下功能: 定义键和分区、选择写操作、指定如何合并记录或选择要读取的视图类型。 WriteClient 配置 : 在内部,Hudi数据源使用基于RDD的HoodieWriteClient API 真正执行对存储的写入。 这些配置可对文件大小、压缩(compression)、并行度、压缩(compaction)、写入模式、清理等底层方面进行完全控制。 尽管Hudi提供了合理的默认设置,但在不同情形下,可能需要对这些配置进行调整以针对特定的工作负载进行优化。 RecordPayload 配置 : 这是Hudi提供的最底层的定制。 RecordPayload定义了如何根据传入的新记录和存储的旧记录来产生新值以进行插入更新。 Hudi提供了诸如OverwriteWithLatestAvroPayload的
 默认实现,该实现仅使用最新或最后写入的记录来更新存储。 在数据源和Wr [...]
+        "tags": [],
+        "url": "http://0.0.0.0:4000/cn/docs/0.5.1-configurations.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "Configurations",
+        "excerpt":"This page covers the different ways of configuring your job to write/read Hudi tables. At a high level, you can control behaviour at few levels. Spark Datasource Configs : These configs control the Hudi Spark Datasource, providing ability to define keys/partitioning, pick out the write operation, specify how to merge...","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/docs/0.5.1-configurations.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "性能",
+        "excerpt":"在本节中,我们将介绍一些有关Hudi插入更新、增量提取的实际性能数据,并将其与实现这些任务的其它传统工具进行比较。   插入更新   下面显示了从NoSQL数据库摄取获得的速度提升,这些速度提升数据是通过在写入时复制存储上的Hudi数据集上插入更新而获得的, 数据集包括5个从小到大的表(相对于批量加载表)。           由于Hudi可以通过增量构建数据集,它也为更频繁地调度摄取提供了可能性,从而减少了延迟,并显著节省了总体计算成本。           Hudi插入更新在t1表的一次提交中就进行了高达4TB的压力测试。 有关一些调优技巧,请参见这里。   索引   为了有效地插入更新数据,Hudi需要将要写入的批量数据中的记录分类为插入和更新(并标记它所属的文件组)。 为了加快此操作的速度,Hudi采用了可插拔索引机制,该机制存储
 了recordKey和它所属的文件组ID之间的映射。 默认情况下,Hudi使用内置索引,该索引使用文件范围和布隆过滤器来完成此任务,相比于Spark Join,其速度最高可提高10倍。   当您将r [...]
+        "tags": [],
+        "url": "http://0.0.0.0:4000/cn/docs/0.5.1-performance.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "Performance",
+        "excerpt":"In this section, we go over some real world performance numbers for Hudi upserts, incremental pull and compare them against the conventional alternatives for achieving these tasks. Upserts Following shows the speed up obtained for NoSQL database ingestion, from incrementally upserting on a Hudi table on the copy-on-write storage, on...","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/docs/0.5.1-performance.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "管理 Hudi Pipelines",
+        "excerpt":"管理员/运维人员可以通过以下方式了解Hudi数据集/管道 通过Admin CLI进行管理 Graphite指标 Hudi应用程序的Spark UI 本节简要介绍了每一种方法,并提供了有关故障排除的一些常规指南 Admin CLI 一旦构建了hudi,就可以通过cd hudi-cli &amp;&amp; ./hudi-cli.sh启动shell。 一个hudi数据集位于DFS上的basePath位置,我们需要该位置才能连接到Hudi数据集。 Hudi库使用.hoodie子文件夹跟踪所有元数据,从而有效地在内部管理该数据集。 初始化hudi表,可使用如下命令。 18/09/06 15:56:52 INFO annotation.AutowiredAnnotationBeanPostProcessor: JSR-330 'javax.inject.Inject' annotation found and supported for autowiring ========================================== [...]
+        "tags": [],
+        "url": "http://0.0.0.0:4000/cn/docs/0.5.1-deployment.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "Deployment Guide",
+        "excerpt":"This section provides all the help you need to deploy and operate Hudi tables at scale. Specifically, we will cover the following aspects. Deployment Model : How various Hudi components are deployed and managed. Upgrading Versions : Picking up new releases of Hudi, guidelines and general best-practices. Migrating to Hudi...","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/docs/0.5.1-deployment.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "Privacy Policy",
+        "excerpt":"Information about your use of this website is collected using server access logs and a tracking cookie. The collected information consists of the following: The IP address from which you access the website; The type of browser and operating system you use to access our site; The date and time...","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/cn/docs/0.5.1-privacy.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "Privacy Policy",
+        "excerpt":"Information about your use of this website is collected using server access logs and a tracking cookie. The collected information consists of the following: The IP address from which you access the website; The type of browser and operating system you use to access our site; The date and time...","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/docs/0.5.1-privacy.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "文档版本",
+        "excerpt":"                                  Latest             英文版             中文版                                      0.5.1             英文版             中文版                                      0.5.0             英文版             中文版                        ","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/cn/docs/0.5.1-docs-versions.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "Docs Versions",
+        "excerpt":"                                  Latest             English Version             Chinese Version                                      0.5.1             English Version             Chinese Version                                      0.5.0             English Version             Chinese Version                       ","categories": [],
+        "tags": [],
+        "url": "http://0.0.0.0:4000/docs/0.5.1-docs-versions.html",
+        "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
+        "title": "S3 Filesystem",
+        "excerpt":"In this page, we explain how to get your Hudi spark job to store into AWS S3. AWS configs There are two configurations required for Hudi-S3 compatibility: Adding AWS Credentials for Hudi Adding required Jars to classpath AWS Credentials Simplest way to use Hudi with S3, is to configure your...","categories": [],
+        "tags": [],
         "url": "http://0.0.0.0:4000/cn/docs/s3_hoodie.html",
         "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
         "title": "S3 Filesystem",
@@ -275,7 +440,7 @@ var store = [{
         "url": "http://0.0.0.0:4000/cn/docs/querying_data.html",
         "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
         "title": "Querying Hudi Tables",
-        "excerpt":"Conceptually, Hudi stores data physically once on DFS, while providing 3 different ways of querying, as explained before. Once the table is synced to the Hive metastore, it provides external Hive tables backed by Hudi’s custom inputformats. Once the proper hudi bundle has been provided, the table can be queried...","categories": [],
+        "excerpt":"Conceptually, Hudi stores data physically once on DFS, while providing 3 different ways of querying, as explained before. Once the table is synced to the Hive metastore, it provides external Hive tables backed by Hudi’s custom inputformats. Once the proper hudi bundle has been installed, the table can be queried...","categories": [],
         "tags": [],
         "url": "http://0.0.0.0:4000/docs/querying_data.html",
         "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
@@ -320,12 +485,12 @@ var store = [{
         "url": "http://0.0.0.0:4000/docs/privacy.html",
         "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
         "title": "文档版本",
-        "excerpt":"                                  Latest             英文版             中文版                                      0.5.0             英文版             中文版                        ","categories": [],
+        "excerpt":"                                  Latest             英文版             中文版                                      0.5.1             英文版             中文版                                      0.5.0             英文版             中文版                        ","categories": [],
         "tags": [],
         "url": "http://0.0.0.0:4000/cn/docs/docs-versions.html",
         "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
         "title": "Docs Versions",
-        "excerpt":"                                  Latest             English Version             Chinese Version                                      0.5.0             English Version             Chinese Version                       ","categories": [],
+        "excerpt":"                                  Latest             English Version             Chinese Version                                      0.5.1             English Version             Chinese Version                                      0.5.0             English Version             Chinese Version                       ","categories": [],
         "tags": [],
         "url": "http://0.0.0.0:4000/docs/docs-versions.html",
         "teaser":"http://0.0.0.0:4000/assets/images/500x300.png"},{
diff --git a/content/cn/activity.html b/content/cn/activity.html
index 1de0061..b48ebc5 100644
--- a/content/cn/activity.html
+++ b/content/cn/activity.html
@@ -154,6 +154,10 @@
 
           
         
+          <li><a href="/security" target="_self" rel="nofollow noopener noreferrer"><i class="fa fa-navicon" aria-hidden="true"></i> Report Security Issues</a></li>
+
+          
+        
       
     </ul>
   </div>
diff --git a/content/cn/community.html b/content/cn/community.html
index e95e726..55c85e2 100644
--- a/content/cn/community.html
+++ b/content/cn/community.html
@@ -154,6 +154,10 @@
 
           
         
+          <li><a href="/security" target="_self" rel="nofollow noopener noreferrer"><i class="fa fa-navicon" aria-hidden="true"></i> Report Security Issues</a></li>
+
+          
+        
       
     </ul>
   </div>
diff --git a/content/cn/contributing.html b/content/cn/contributing.html
index c72e171..e053949 100644
--- a/content/cn/contributing.html
+++ b/content/cn/contributing.html
@@ -154,6 +154,10 @@
 
           
         
+          <li><a href="/security" target="_self" rel="nofollow noopener noreferrer"><i class="fa fa-navicon" aria-hidden="true"></i> Report Security Issues</a></li>
+
+          
+        
       
     </ul>
   </div>
@@ -307,7 +311,14 @@ so that the community can contribute at large and help implement it much quickly
   <li>Once you finalize on a project/task, please open a new JIRA or assign an existing one to yourself.
     <ul>
       <li>Almost all PRs should be linked to a JIRA. It’s always good to have a JIRA upfront to avoid duplicating efforts.</li>
-      <li>If the changes are minor, then <code class="highlighter-rouge">[MINOR]</code> prefix can be added to Pull Request title without a JIRA.</li>
+      <li>If the changes are minor, then the <code class="highlighter-rouge">[MINOR]</code> prefix can be added to the Pull Request title without a JIRA. Some tips for judging whether a Pull Request is <strong>MINOR</strong>:
+        <ul>
+          <li>it is a trivial fix (for example, a typo, a broken link, or another obvious error)</li>
+          <li>the change does not alter functionality or performance in any way</li>
+          <li>it changes fewer than 100 lines</li>
+          <li>it would obviously pass without waiting for CI/CD verification</li>
+        </ul>
+      </li>
       <li>But, you may be asked to file a JIRA, if reviewer deems it necessary</li>
     </ul>
   </li>
diff --git a/content/cn/docs/0.5.0-docs-versions.html b/content/cn/docs/0.5.0-docs-versions.html
index da9849f..1d5320b 100644
--- a/content/cn/docs/0.5.0-docs-versions.html
+++ b/content/cn/docs/0.5.0-docs-versions.html
@@ -4,7 +4,7 @@
     <meta charset="utf-8">
 
 <!-- begin _includes/seo.html --><title>文档版本 - Apache Hudi</title>
-<meta name="description" content="                              Latest            英文版            中文版                                  0.5.0            英文版            中文版                  ">
+<meta name="description" content="                              Latest            英文版            中文版                                  0.5.1            英文版            中文版                                  0.5.0            英文版            中文版                  ">
 
 <meta property="og:type" content="article">
 <meta property="og:locale" content="en_US">
@@ -13,7 +13,7 @@
 <meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.0-docs-versions.html">
 
 
-  <meta property="og:description" content="                              Latest            英文版            中文版                                  0.5.0            英文版            中文版                  ">
+  <meta property="og:description" content="                              Latest            英文版            中文版                                  0.5.1            英文版            中文版                                  0.5.0            英文版            中文版                  ">
 
 
 
@@ -347,6 +347,12 @@
         </tr>
       
         <tr>
+            <th>0.5.1</th>
+            <td><a href="/docs/0.5.1-quick-start-guide.html">英文版</a></td>
+            <td><a href="/cn/docs/0.5.1-quick-start-guide.html">中文版</a></td>
+        </tr>
+      
+        <tr>
             <th>0.5.0</th>
             <td><a href="/docs/0.5.0-quick-start-guide.html">英文版</a></td>
             <td><a href="/cn/docs/0.5.0-quick-start-guide.html">中文版</a></td>
diff --git a/content/cn/docs/0.5.0-docs-versions.html b/content/cn/docs/0.5.1-comparison.html
similarity index 51%
copy from content/cn/docs/0.5.0-docs-versions.html
copy to content/cn/docs/0.5.1-comparison.html
index da9849f..cab1258 100644
--- a/content/cn/docs/0.5.0-docs-versions.html
+++ b/content/cn/docs/0.5.1-comparison.html
@@ -3,17 +3,17 @@
   <head>
     <meta charset="utf-8">
 
-<!-- begin _includes/seo.html --><title>文档版本 - Apache Hudi</title>
-<meta name="description" content="                              Latest            英文版            中文版                                  0.5.0            英文版            中文版                  ">
+<!-- begin _includes/seo.html --><title>对比 - Apache Hudi</title>
+<meta name="description" content="Apache Hudi填补了在DFS上处理数据的巨大空白,并可以和这些技术很好地共存。然而,通过将Hudi与一些相关系统进行对比,来了解Hudi如何适应当前的大数据生态系统,并知晓这些系统在设计中做的不同权衡仍将非常有用。">
 
 <meta property="og:type" content="article">
 <meta property="og:locale" content="en_US">
 <meta property="og:site_name" content="">
-<meta property="og:title" content="文档版本">
-<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.0-docs-versions.html">
+<meta property="og:title" content="对比">
+<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.1-comparison.html">
 
 
-  <meta property="og:description" content="                              Latest            英文版            中文版                                  0.5.0            英文版            中文版                  ">
+  <meta property="og:description" content="Apache Hudi填补了在DFS上处理数据的巨大空白,并可以和这些技术很好地共存。然而,通过将Hudi与一些相关系统进行对比,来了解Hudi如何适应当前的大数据生态系统,并知晓这些系统在设计中做的不同权衡仍将非常有用。">
 
 
 
@@ -147,7 +147,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-quick-start-guide.html" class="">快速开始</a></li>
+              <li><a href="/cn/docs/0.5.1-quick-start-guide.html" class="">快速开始</a></li>
             
 
           
@@ -158,7 +158,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-use_cases.html" class="">使用案例</a></li>
+              <li><a href="/cn/docs/0.5.1-use_cases.html" class="">使用案例</a></li>
             
 
           
@@ -169,7 +169,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-powered_by.html" class="">演讲 & hudi 用户</a></li>
+              <li><a href="/cn/docs/0.5.1-powered_by.html" class="">演讲 & hudi 用户</a></li>
             
 
           
@@ -180,7 +180,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-comparison.html" class="">对比</a></li>
+              <li><a href="/cn/docs/0.5.1-comparison.html" class="active">对比</a></li>
             
 
           
@@ -191,7 +191,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-docker_demo.html" class="">Docker 示例</a></li>
+              <li><a href="/cn/docs/0.5.1-docker_demo.html" class="">Docker 示例</a></li>
             
 
           
@@ -214,7 +214,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-concepts.html" class="">概念</a></li>
+              <li><a href="/cn/docs/0.5.1-concepts.html" class="">概念</a></li>
             
 
           
@@ -225,7 +225,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-writing_data.html" class="">写入数据</a></li>
+              <li><a href="/cn/docs/0.5.1-writing_data.html" class="">写入数据</a></li>
             
 
           
@@ -236,7 +236,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-querying_data.html" class="">查询数据</a></li>
+              <li><a href="/cn/docs/0.5.1-querying_data.html" class="">查询数据</a></li>
             
 
           
@@ -247,7 +247,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-configurations.html" class="">配置</a></li>
+              <li><a href="/cn/docs/0.5.1-configurations.html" class="">配置</a></li>
             
 
           
@@ -258,7 +258,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-performance.html" class="">性能</a></li>
+              <li><a href="/cn/docs/0.5.1-performance.html" class="">性能</a></li>
             
 
           
@@ -269,7 +269,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-admin_guide.html" class="">管理</a></li>
+              <li><a href="/cn/docs/0.5.1-deployment.html" class="">管理</a></li>
             
 
           
@@ -292,7 +292,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-docs-versions.html" class="active">文档版本</a></li>
+              <li><a href="/cn/docs/0.5.1-docs-versions.html" class="">文档版本</a></li>
             
 
           
@@ -303,7 +303,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-privacy.html" class="">版权信息</a></li>
+              <li><a href="/cn/docs/0.5.1-privacy.html" class="">版权信息</a></li>
             
 
           
@@ -324,7 +324,7 @@
     <div class="page__inner-wrap">
       
         <header>
-          <h1 id="page-title" class="page__title" itemprop="headline">文档版本
+          <h1 id="page-title" class="page__title" itemprop="headline">对比
 </h1>
         </header>
       
@@ -337,23 +337,47 @@
             }
           </style>
         
-        <table class="docversions">
-    <tbody>
-      
-        <tr>
-            <th>Latest</th>
-            <td><a href="/docs/quick-start-guide.html">英文版</a></td>
-            <td><a href="/cn/docs/quick-start-guide.html">中文版</a></td>
-        </tr>
-      
-        <tr>
-            <th>0.5.0</th>
-            <td><a href="/docs/0.5.0-quick-start-guide.html">英文版</a></td>
-            <td><a href="/cn/docs/0.5.0-quick-start-guide.html">中文版</a></td>
-        </tr>
-      
-    </tbody>
-</table>
+        <p>Apache Hudi填补了在DFS上处理数据的巨大空白,并可以和这些技术很好地共存。然而,
+通过将Hudi与一些相关系统进行对比,来了解Hudi如何适应当前的大数据生态系统,并知晓这些系统在设计中做的不同权衡仍将非常有用。</p>
+
+<h2 id="kudu">Kudu</h2>
+
+<p><a href="https://kudu.apache.org">Apache Kudu</a>是一个与Hudi具有相似目标的存储系统,该系统通过对<code class="highlighter-rouge">upserts</code>支持来对PB级数据进行实时分析。
+一个关键的区别是Kudu还试图充当OLTP工作负载的数据存储,而Hudi并不希望这样做。
+因此,Kudu不支持增量拉取(截至2017年初),而Hudi支持以便进行增量处理。</p>
+
+<p>Kudu与分布式文件系统抽象和HDFS完全不同,它自己的一组存储服务器通过RAFT相互通信。
+与之不同的是,Hudi旨在与底层Hadoop兼容的文件系统(HDFS,S3或Ceph)一起使用,并且没有自己的存储服务器群,而是依靠Apache Spark来完成繁重的工作。
+因此,Hudi可以像其他Spark作业一样轻松扩展,而Kudu则需要硬件和运营支持,特别是HBase或Vertica等数据存储系统。
+到目前为止,我们还没有做任何直接的基准测试来比较Kudu和Hudi(鉴于RTTable正在进行中)。
+但是,如果我们要参考<a href="https://db-blog.web.cern.ch/blog/zbigniew-baranowski/2017-01-performance-comparison-different-file-formats-and-storage-engines">CERN</a>分享的测试结果,
+我们预期Hudi在摄取parquet上有更卓越的性能。</p>
+
+<h2 id="hive事务">Hive事务</h2>
+
+<p><a href="https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions">Hive事务/ACID</a>是另一项类似的工作,它试图实现在ORC文件格式之上的存储<code class="highlighter-rouge">读取时合并</code>。
+可以理解,此功能与Hive以及<a href="https://cwiki.apache.org/confluence/display/Hive/LLAP">LLAP</a>之类的其他工作紧密相关。
+Hive事务不提供Hudi提供的读取优化存储选项或增量拉取。
+在实现选择方面,Hudi充分利用了类似Spark的处理框架的功能,而Hive事务特性则在用户或Hive Metastore启动的Hive任务/查询的下实现。
+根据我们的生产经验,与其他方法相比,将Hudi作为库嵌入到现有的Spark管道中要容易得多,并且操作不会太繁琐。
+Hudi还设计用于与Presto/Spark等非Hive引擎合作,并计划引入除parquet以外的文件格式。</p>
+
+<h2 id="hbase">HBase</h2>
+
+<p>尽管<a href="https://hbase.apache.org">HBase</a>最终是OLTP工作负载的键值存储层,但由于与Hadoop的相似性,用户通常倾向于将HBase与分析相关联。
+鉴于HBase经过严格的写优化,它支持开箱即用的亚秒级更新,Hive-on-HBase允许用户查询该数据。 但是,就分析工作负载的实际性能而言,Parquet/ORC之类的混合列式存储格式可以轻松击败HBase,因为这些工作负载主要是读取繁重的工作。
+Hudi弥补了更快的数据与分析存储格式之间的差距。从运营的角度来看,与管理分析使用的HBase region服务器集群相比,为用户提供可更快给出数据的库更具可扩展性。
+最终,HBase不像Hudi这样重点支持<code class="highlighter-rouge">提交时间</code>、<code class="highlighter-rouge">增量拉取</code>之类的增量处理原语。</p>
+
+<h2 id="流式处理">流式处理</h2>
+
+<p>一个普遍的问题:”Hudi与流处理系统有何关系?”,我们将在这里尝试回答。简而言之,Hudi可以与当今的批处理(<code class="highlighter-rouge">写时复制存储</code>)和流处理(<code class="highlighter-rouge">读时合并存储</code>)作业集成,以将计算结果存储在Hadoop中。
+对于Spark应用程序,这可以通过将Hudi库与Spark/Spark流式DAG直接集成来实现。在非Spark处理系统(例如Flink、Hive)情况下,可以在相应的系统中进行处理,然后通过Kafka主题/DFS中间文件将其发送到Hudi表中。从概念上讲,数据处理
+管道仅由三个部分组成:<code class="highlighter-rouge">输入</code>,<code class="highlighter-rouge">处理</code>,<code class="highlighter-rouge">输出</code>,用户最终针对输出运行查询以便使用管道的结果。Hudi可以充当将数据存储在DFS上的输入或输出。Hudi在给定流处理管道上的适用性最终归结为你的查询在Presto/SparkSQL/Hive的适用性。</p>
+
+<p>更高级的用例围绕<a href="https://www.oreilly.com/ideas/ubers-case-for-incremental-processing-on-hadoop">增量处理</a>的概念展开,
+甚至在<code class="highlighter-rouge">处理</code>引擎内部也使用Hudi来加速典型的批处理管道。例如:Hudi可用作DAG内的状态存储(类似Flink使用的<a href="https://ci.apache.org/projects/flink/flink-docs-release-1.2/ops/state_backends.html#the-rocksdbstatebackend">rocksDB</a>)。
+这是路线图上的一个项目并将最终以<a href="https://issues.apache.org/jira/browse/HUDI-60">Beam Runner</a>的形式呈现。</p>
 
       </section>
 
diff --git a/content/cn/docs/0.5.1-concepts.html b/content/cn/docs/0.5.1-concepts.html
new file mode 100644
index 0000000..aab4573
--- /dev/null
+++ b/content/cn/docs/0.5.1-concepts.html
@@ -0,0 +1,613 @@
+<!doctype html>
+<html lang="en" class="no-js">
+  <head>
+    <meta charset="utf-8">
+
+<!-- begin _includes/seo.html --><title>概念 - Apache Hudi</title>
+<meta name="description" content="Apache Hudi(发音为“Hudi”)在DFS的数据集上提供以下流原语">
+
+<meta property="og:type" content="article">
+<meta property="og:locale" content="en_US">
+<meta property="og:site_name" content="">
+<meta property="og:title" content="概念">
+<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.1-concepts.html">
+
+
+  <meta property="og:description" content="Apache Hudi(发音为“Hudi”)在DFS的数据集上提供以下流原语">
+
+
+
+
+
+  <meta property="article:modified_time" content="2019-12-30T14:59:57-05:00">
+
+
+
+
+
+
+
+<!-- end _includes/seo.html -->
+
+
+<!--<link href="/feed.xml" type="application/atom+xml" rel="alternate" title=" Feed">-->
+
+<!-- https://t.co/dKP3o1e -->
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+<script>
+  document.documentElement.className = document.documentElement.className.replace(/\bno-js\b/g, '') + ' js ';
+</script>
+
+<!-- For all browsers -->
+<link rel="stylesheet" href="/assets/css/main.css">
+
+<!--[if IE]>
+  <style>
+    /* old IE unsupported flexbox fixes */
+    .greedy-nav .site-title {
+      padding-right: 3em;
+    }
+    .greedy-nav button {
+      position: absolute;
+      top: 0;
+      right: 0;
+      height: 100%;
+    }
+  </style>
+<![endif]-->
+
+
+
+<link rel="icon" type="image/x-icon" href="/assets/images/favicon.ico">
+<link rel="stylesheet" href="/assets/css/font-awesome.min.css">
+
+  </head>
+
+  <body class="layout--single">
+    <!--[if lt IE 9]>
+<div class="notice--danger align-center" style="margin: 0;">You are using an <strong>outdated</strong> browser. Please <a href="https://browsehappy.com/">upgrade your browser</a> to improve your experience.</div>
+<![endif]-->
+
+    <div class="masthead">
+  <div class="masthead__inner-wrap" id="masthead__inner-wrap">
+    <div class="masthead__menu">
+      <nav id="site-nav" class="greedy-nav">
+        
+          <a class="site-logo" href="/">
+              <div style="width: 150px; height: 40px">
+              </div>
+          </a>
+        
+        <a class="site-title" href="/">
+          
+        </a>
+        <ul class="visible-links"><li class="masthead__menu-item">
+              <a href="/cn/docs/quick-start-guide.html" target="_self" >文档</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/community.html" target="_self" >社区</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/activity.html" target="_self" >动态</a>
+            </li><li class="masthead__menu-item">
+              <a href="https://cwiki.apache.org/confluence/display/HUDI/FAQ" target="_blank" >FAQ</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/releases.html" target="_self" >发布</a>
+            </li></ul>
+        <button class="greedy-nav__toggle hidden" type="button">
+          <span class="visually-hidden">Toggle menu</span>
+          <div class="navicon"></div>
+        </button>
+        <ul class="hidden-links hidden"></ul>
+      </nav>
+    </div>
+  </div>
+</div>
+<!--
+<p class="notice--warning" style="margin: 0 !important; text-align: center !important;"><strong>Note:</strong> This site is work in progress, if you notice any issues, please <a target="_blank" href="https://github.com/apache/incubator-hudi/issues">Report on Issue</a>.
+  Click <a href="/"> here</a> back to old site.</p>
+-->
+
+    <div class="initial-content">
+      <div id="main" role="main">
+  
+
+  <div class="sidebar sticky">
+
+  
+
+  
+
+    
+      
+
+
+
+
+
+
+
+<nav class="nav__list">
+  
+  <input id="ac-toc" name="accordion-toc" type="checkbox" />
+  <label for="ac-toc">文档菜单</label>
+  <ul class="nav__items">
+    
+      <li>
+        
+          <span class="nav__sub-title">入门指南</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-quick-start-guide.html" class="">快速开始</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-use_cases.html" class="">使用案例</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-powered_by.html" class="">演讲 & hudi 用户</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-comparison.html" class="">对比</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-docker_demo.html" class="">Docker 示例</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">帮助文档</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-concepts.html" class="active">概念</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-writing_data.html" class="">写入数据</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-querying_data.html" class="">查询数据</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-configurations.html" class="">配置</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-performance.html" class="">性能</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-deployment.html" class="">管理</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">其他信息</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-docs-versions.html" class="">文档版本</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-privacy.html" class="">版权信息</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+  </ul>
+</nav>
+    
+
+  
+  </div>
+
+
+  <article class="page" itemscope itemtype="https://schema.org/CreativeWork">
+
+    <div class="page__inner-wrap">
+      
+        <header>
+          <h1 id="page-title" class="page__title" itemprop="headline">概念
+</h1>
+        </header>
+      
+
+      <section class="page__content" itemprop="text">
+        
+        <aside class="sidebar__right sticky">
+          <nav class="toc">
+            <header><h4 class="nav__title"><i class="fas fa-file-alt"></i> IN THIS PAGE</h4></header>
+            <ul class="toc__menu">
+  <li><a href="#时间轴">时间轴</a></li>
+  <li><a href="#文件组织">文件组织</a></li>
+  <li><a href="#存储类型和视图">存储类型和视图</a>
+    <ul>
+      <li><a href="#存储类型">存储类型</a></li>
+      <li><a href="#视图">视图</a></li>
+    </ul>
+  </li>
+  <li><a href="#copy-on-write-storage">写时复制存储</a></li>
+  <li><a href="#merge-on-read-storage">读时合并存储</a></li>
+</ul>
+          </nav>
+        </aside>
+        
+        <p>Apache Hudi(发音为“Hudi”)在DFS的数据集上提供以下流原语</p>
+
+<ul>
+  <li>插入更新           (如何改变数据集?)</li>
+  <li>增量拉取           (如何获取变更的数据?)</li>
+</ul>
+
+<p>在本节中,我们将讨论重要的概念和术语,这些概念和术语有助于理解并有效使用这些原语。</p>
+
+<h2 id="时间轴">时间轴</h2>
+<p>在它的核心,Hudi维护一条包含在不同的<code class="highlighter-rouge">即时</code>时间所有对数据集操作的<code class="highlighter-rouge">时间轴</code>,从而提供,从不同时间点出发得到不同的视图下的数据集。Hudi即时包含以下组件</p>
+
+<ul>
+  <li><code class="highlighter-rouge">操作类型</code> : 对数据集执行的操作类型</li>
+  <li><code class="highlighter-rouge">即时时间</code> : 即时时间通常是一个时间戳(例如:20190117010349),该时间戳按操作开始时间的顺序单调增加。</li>
+  <li><code class="highlighter-rouge">状态</code> : 即时的状态</li>
+</ul>
+
+<p>Hudi保证在时间轴上执行的操作的原子性和基于即时时间的时间轴一致性。</p>
+
+<p>执行的关键操作包括</p>
+
+<ul>
+  <li><code class="highlighter-rouge">COMMITS</code> - 一次提交表示将一组记录<strong>原子写入</strong>到数据集中。</li>
+  <li><code class="highlighter-rouge">CLEANS</code> - 删除数据集中不再需要的旧文件版本的后台活动。</li>
+  <li><code class="highlighter-rouge">DELTA_COMMIT</code> - 增量提交是指将一批记录<strong>原子写入</strong>到MergeOnRead存储类型的数据集中,其中一些/所有数据都可以只写到增量日志中。</li>
+  <li><code class="highlighter-rouge">COMPACTION</code> - 协调Hudi中差异数据结构的后台活动,例如:将更新从基于行的日志文件变成列格式。在内部,压缩表现为时间轴上的特殊提交。</li>
+  <li><code class="highlighter-rouge">ROLLBACK</code> - 表示提交/增量提交不成功且已回滚,删除在写入过程中产生的所有部分文件。</li>
+  <li><code class="highlighter-rouge">SAVEPOINT</code> - 将某些文件组标记为”已保存”,以便清理程序不会将其删除。在发生灾难/数据恢复的情况下,它有助于将数据集还原到时间轴上的某个点。</li>
+</ul>
+
+<p>任何给定的即时都可以处于以下状态之一</p>
+
+<ul>
+  <li><code class="highlighter-rouge">REQUESTED</code> - 表示已调度但尚未启动的操作。</li>
+  <li><code class="highlighter-rouge">INFLIGHT</code> - 表示当前正在执行该操作。</li>
+  <li><code class="highlighter-rouge">COMPLETED</code> - 表示在时间轴上完成了该操作。</li>
+</ul>
+
+<figure>
+    <img class="docimage" src="/assets/images/hudi_timeline.png" alt="hudi_timeline.png" />
+</figure>
+
+<p>上面的示例显示了在Hudi数据集上大约10:00到10:20之间发生的更新事件,大约每5分钟一次,将提交元数据以及其他后台清理/压缩保留在Hudi时间轴上。
+观察的关键点是:提交时间指示数据的<code class="highlighter-rouge">到达时间</code>(上午10:20),而实际数据组织则反映了实际时间或<code class="highlighter-rouge">事件时间</code>,即数据所反映的(从07:00开始的每小时时段)。在权衡数据延迟和完整性时,这是两个关键概念。</p>
+
+<p>如果有延迟到达的数据(事件时间为9:00的数据在10:20达到,延迟 &gt;1 小时),我们可以看到upsert将新数据生成到更旧的时间段/文件夹中。
+在时间轴的帮助下,增量查询可以只提取10:00以后成功提交的新数据,并非常高效地只消费更改过的文件,且无需扫描更大的文件范围,例如07:00后的所有时间段。</p>
+
+<h2 id="文件组织">文件组织</h2>
+<p>Hudi将DFS上的数据集组织到<code class="highlighter-rouge">基本路径</code>下的目录结构中。数据集分为多个分区,这些分区是包含该分区的数据文件的文件夹,这与Hive表非常相似。
+每个分区被相对于基本路径的特定<code class="highlighter-rouge">分区路径</code>区分开来。</p>
+
+<p>在每个分区内,文件被组织为<code class="highlighter-rouge">文件组</code>,由<code class="highlighter-rouge">文件id</code>唯一标识。
+每个文件组包含多个<code class="highlighter-rouge">文件切片</code>,其中每个切片包含在某个提交/压缩即时时间生成的基本列文件(<code class="highlighter-rouge">*.parquet</code>)以及一组日志文件(<code class="highlighter-rouge">*.log*</code>),该文件包含自生成基本文件以来对基本文件的插入/更新。
+Hudi采用MVCC设计,其中压缩操作将日志和基本文件合并以产生新的文件片,而清理操作则将未使用的/较旧的文件片删除以回收DFS上的空间。</p>
+
+<p>Hudi通过索引机制将给定的hoodie键(记录键+分区路径)映射到文件组,从而提供了高效的Upsert。
+一旦将记录的第一个版本写入文件,记录键和文件组/文件id之间的映射就永远不会改变。 简而言之,映射的文件组包含一组记录的所有版本。</p>
+
+<h2 id="存储类型和视图">存储类型和视图</h2>
+<p>Hudi存储类型定义了如何在DFS上对数据进行索引和布局以及如何在这种组织之上实现上述原语和时间轴活动(即如何写入数据)。
+反过来,<code class="highlighter-rouge">视图</code>定义了基础数据如何暴露给查询(即如何读取数据)。</p>
+
+<table>
+  <thead>
+    <tr>
+      <th>存储类型</th>
+      <th>支持的视图</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>写时复制</td>
+      <td>读优化 + 增量</td>
+    </tr>
+    <tr>
+      <td>读时合并</td>
+      <td>读优化 + 增量 + 近实时</td>
+    </tr>
+  </tbody>
+</table>
+
+<h3 id="存储类型">存储类型</h3>
+<p>Hudi支持以下存储类型。</p>
+
+<ul>
+  <li>
+    <p><a href="#copy-on-write-storage">写时复制</a> : 仅使用列文件格式(例如parquet)存储数据。通过在写入过程中执行同步合并以更新版本并重写文件。</p>
+  </li>
+  <li>
+    <p><a href="#merge-on-read-storage">读时合并</a> : 使用列式(例如parquet)+ 基于行(例如avro)的文件格式组合来存储数据。 更新记录到增量文件中,然后进行同步或异步压缩以生成列文件的新版本。</p>
+  </li>
+</ul>
+
+<p>下表总结了这两种存储类型之间的权衡</p>
+
+<table>
+  <thead>
+    <tr>
+      <th>权衡</th>
+      <th>写时复制</th>
+      <th>读时合并</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>数据延迟</td>
+      <td>更高</td>
+      <td>更低</td>
+    </tr>
+    <tr>
+      <td>更新代价(I/O)</td>
+      <td>更高(重写整个parquet文件)</td>
+      <td>更低(追加到增量日志)</td>
+    </tr>
+    <tr>
+      <td>Parquet文件大小</td>
+      <td>更小(高更新代价(I/o))</td>
+      <td>更大(低更新代价)</td>
+    </tr>
+    <tr>
+      <td>写放大</td>
+      <td>更高</td>
+      <td>更低(取决于压缩策略)</td>
+    </tr>
+  </tbody>
+</table>
+
+<h3 id="视图">视图</h3>
+<p>Hudi支持以下存储数据的视图</p>
+
+<ul>
+  <li><strong>读优化视图</strong> : 在此视图上的查询将查看给定提交或压缩操作中数据集的最新快照。
+ 该视图仅将最新文件切片中的基本/列文件暴露给查询,并保证与非Hudi列式数据集相比,具有相同的列式查询性能。</li>
+  <li><strong>增量视图</strong> : 对该视图的查询只能看到从某个提交/压缩后写入数据集的新数据。该视图有效地提供了更改流,来支持增量数据管道。</li>
+  <li><strong>实时视图</strong> : 在此视图上的查询将查看某个增量提交操作中数据集的最新快照。该视图通过动态合并最新的基本文件(例如parquet)和增量文件(例如avro)来提供近实时数据集(几分钟的延迟)。</li>
+</ul>
+
+<p>下表总结了不同视图之间的权衡。</p>
+
+<table>
+  <thead>
+    <tr>
+      <th>权衡</th>
+      <th>读优化</th>
+      <th>实时</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>数据延迟</td>
+      <td>更高</td>
+      <td>更低</td>
+    </tr>
+    <tr>
+      <td>查询延迟</td>
+      <td>更低(原始列式性能)</td>
+      <td>更高(合并列式 + 基于行的增量)</td>
+    </tr>
+  </tbody>
+</table>
+
+<h2 id="copy-on-write-storage">写时复制存储</h2>
+
+<p>写时复制存储中的文件片仅包含基本/列文件,并且每次提交都会生成新版本的基本文件。
+换句话说,我们压缩每个提交,从而所有的数据都是以列数据的形式储存。在这种情况下,写入数据非常昂贵(我们需要重写整个列数据文件,即使只有一个字节的新数据被提交),而读取数据的成本则没有增加。
+这种视图有利于读取繁重的分析工作。</p>
+
+<p>以下内容说明了将数据写入写时复制存储并在其上运行两个查询时,它是如何工作的。</p>
+
+<figure>
+    <img class="docimage" src="/assets/images/hudi_cow.png" alt="hudi_cow.png" />
+</figure>
+
+<p>随着数据的写入,对现有文件组的更新将为该文件组生成一个带有提交即时时间标记的新切片,而插入分配一个新文件组并写入该文件组的第一个切片。
+这些文件切片及其提交即时时间在上面用颜色编码。
+针对这样的数据集运行SQL查询(例如:<code class="highlighter-rouge">select count(*)</code>统计该分区中的记录数目),首先检查时间轴上的最新提交并过滤每个文件组中除最新文件片以外的所有文件片。
+如您所见,旧查询不会看到以粉红色标记的当前进行中的提交的文件,但是在该提交后的新查询会获取新数据。因此,查询不受任何写入失败/部分写入的影响,仅运行在已提交数据上。</p>
+
+<p>写时复制存储的目的是从根本上改善当前管理数据集的方式,通过以下方法来实现</p>
+
+<ul>
+  <li>优先支持在文件级原子更新数据,而无需重写整个表/分区</li>
+  <li>能够只读取更新的部分,而不是进行低效的扫描或搜索</li>
+  <li>严格控制文件大小来保持出色的查询性能(小的文件会严重损害查询性能)。</li>
+</ul>
+
+<h2 id="merge-on-read-storage">读时合并存储</h2>
+
+<p>读时合并存储是写时复制的升级版,从某种意义上说,它仍然可以通过读优化表提供数据集的读取优化视图(写时复制的功能)。
+此外,它将每个文件组的更新插入存储到基于行的增量日志中,通过文件id,将增量日志和最新版本的基本文件进行合并,从而提供近实时的数据查询。因此,此存储类型智能地平衡了读和写的成本,以提供近乎实时的查询。
+这里最重要的一点是压缩器,它现在可以仔细挑选需要压缩到其列式基础文件中的增量日志(根据增量日志的文件大小),以保持查询性能(较大的增量日志将会提升近实时的查询时间,并同时需要更长的合并时间)。</p>
+
+<p>以下内容说明了存储的工作方式,并显示了对近实时表和读优化表的查询。</p>
+
+<figure>
+    <img class="docimage" src="/assets/images/hudi_mor.png" alt="hudi_mor.png" style="max-width: 100%" />
+</figure>
+
+<p>此示例中发生了很多有趣的事情,这些带出了该方法的微妙之处。</p>
+
+<ul>
+  <li>现在,我们每1分钟左右就有一次提交,这是其他存储类型无法做到的。</li>
+  <li>现在,在每个文件id组中,都有一个增量日志,其中包含对基础列文件中记录的更新。
+ 在示例中,增量日志包含10:05至10:10的所有数据。与以前一样,基本列式文件仍使用提交进行版本控制。
+ 因此,如果只看一眼基本文件,那么存储布局看起来就像是写时复制表的副本。</li>
+  <li>定期压缩过程会从增量日志中合并这些更改,并生成基础文件的新版本,就像示例中10:05发生的情况一样。</li>
+  <li>有两种查询同一存储的方式:读优化(RO)表和近实时(RT)表,具体取决于我们选择查询性能还是数据新鲜度。</li>
+  <li>对于RO表来说,提交数据在何时可用于查询将有些许不同。 请注意,以10:10运行的(在RO表上的)此类查询将不会看到10:05之后的数据,而在RT表上的查询总会看到最新的数据。</li>
+  <li>何时触发压缩以及压缩什么是解决这些难题的关键。
+ 通过实施压缩策略,在该策略中,与较旧的分区相比,我们会积极地压缩最新的分区,从而确保RO表能够以一致的方式看到几分钟内发布的数据。</li>
+</ul>
+
+<p>读时合并存储上的目的是直接在DFS上启用近实时处理,而不是将数据复制到专用系统,后者可能无法处理大数据量。
+该存储还有一些其他方面的好处,例如通过避免数据的同步合并来减少写放大,即批量数据中每1字节数据需要的写入数据量。</p>
+
+      </section>
+
+      <a href="#masthead__inner-wrap" class="back-to-top">Back to top &uarr;</a>
+
+
+      
+
+    </div>
+
+  </article>
+
+</div>
+
+    </div>
+
+    <div class="page__footer">
+      <footer>
+        
+<div class="row">
+  <div class="col-lg-12 footer">
+    <p>
+      <a class="footer-link-img" href="https://apache.org">
+        <img width="250px" src="/assets/images/asf_logo.svg" alt="The Apache Software Foundation">
+      </a>
+    </p>
+    <p>
+      Copyright &copy; <span id="copyright-year">2019</span> <a href="https://apache.org">The Apache Software Foundation</a>, Licensed under the Apache License, Version 2.0.
+      Hudi, Apache and the Apache feather logo are trademarks of The Apache Software Foundation. <a href="/docs/privacy">Privacy Policy</a>
+      <br>
+      Apache Hudi is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the <a href="http://incubator.apache.org/">Apache Incubator</a>.
+      Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have
+      stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a
+      reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
+    </p>
+  </div>
+</div>
+      </footer>
+    </div>
+
+    
+<script src="/assets/js/main.min.js"></script>
+
+
+  </body>
+</html>
\ No newline at end of file
diff --git a/content/cn/docs/0.5.1-configurations.html b/content/cn/docs/0.5.1-configurations.html
new file mode 100644
index 0000000..78adf38
--- /dev/null
+++ b/content/cn/docs/0.5.1-configurations.html
@@ -0,0 +1,860 @@
+<!doctype html>
+<html lang="en" class="no-js">
+  <head>
+    <meta charset="utf-8">
+
+<!-- begin _includes/seo.html --><title>Configurations - Apache Hudi</title>
+<meta name="description" content="This page covers the different ways of configuring your job to write/read Hudi datasets. At a high level, you can control behaviour at a few levels.">
+
+<meta property="og:type" content="article">
+<meta property="og:locale" content="en_US">
+<meta property="og:site_name" content="">
+<meta property="og:title" content="Configurations">
+<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.1-configurations.html">
+
+
+  <meta property="og:description" content="This page covers the different ways of configuring your job to write/read Hudi datasets. At a high level, you can control behaviour at a few levels.">
+
+
+
+
+
+  <meta property="article:modified_time" content="2019-12-30T14:59:57-05:00">
+
+
+
+
+
+
+
+<!-- end _includes/seo.html -->
+
+
+<!--<link href="/feed.xml" type="application/atom+xml" rel="alternate" title=" Feed">-->
+
+<!-- https://t.co/dKP3o1e -->
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+<script>
+  document.documentElement.className = document.documentElement.className.replace(/\bno-js\b/g, '') + ' js ';
+</script>
+
+<!-- For all browsers -->
+<link rel="stylesheet" href="/assets/css/main.css">
+
+<!--[if IE]>
+  <style>
+    /* old IE unsupported flexbox fixes */
+    .greedy-nav .site-title {
+      padding-right: 3em;
+    }
+    .greedy-nav button {
+      position: absolute;
+      top: 0;
+      right: 0;
+      height: 100%;
+    }
+  </style>
+<![endif]-->
+
+
+
+<link rel="icon" type="image/x-icon" href="/assets/images/favicon.ico">
+<link rel="stylesheet" href="/assets/css/font-awesome.min.css">
+
+  </head>
+
+  <body class="layout--single">
+    <!--[if lt IE 9]>
+<div class="notice--danger align-center" style="margin: 0;">You are using an <strong>outdated</strong> browser. Please <a href="https://browsehappy.com/">upgrade your browser</a> to improve your experience.</div>
+<![endif]-->
+
+    <div class="masthead">
+  <div class="masthead__inner-wrap" id="masthead__inner-wrap">
+    <div class="masthead__menu">
+      <nav id="site-nav" class="greedy-nav">
+        
+          <a class="site-logo" href="/">
+              <div style="width: 150px; height: 40px">
+              </div>
+          </a>
+        
+        <a class="site-title" href="/">
+          
+        </a>
+        <ul class="visible-links"><li class="masthead__menu-item">
+              <a href="/cn/docs/quick-start-guide.html" target="_self" >Documentation</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/community.html" target="_self" >Community</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/activity.html" target="_self" >Activities</a>
+            </li><li class="masthead__menu-item">
+              <a href="https://cwiki.apache.org/confluence/display/HUDI/FAQ" target="_blank" >FAQ</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/releases.html" target="_self" >Releases</a>
+            </li></ul>
+        <button class="greedy-nav__toggle hidden" type="button">
+          <span class="visually-hidden">Toggle menu</span>
+          <div class="navicon"></div>
+        </button>
+        <ul class="hidden-links hidden"></ul>
+      </nav>
+    </div>
+  </div>
+</div>
+<!--
+<p class="notice--warning" style="margin: 0 !important; text-align: center !important;"><strong>Note:</strong> This site is work in progress, if you notice any issues, please <a target="_blank" href="https://github.com/apache/incubator-hudi/issues">Report on Issue</a>.
+  Click <a href="/"> here</a> back to old site.</p>
+-->
+
+    <div class="initial-content">
+      <div id="main" role="main">
+  
+
+  <div class="sidebar sticky">
+
+  
+
+  
+
+    
+      
+
+
+
+
+
+
+
+<nav class="nav__list">
+  
+  <input id="ac-toc" name="accordion-toc" type="checkbox" />
+  <label for="ac-toc">Documentation Menu</label>
+  <ul class="nav__items">
+    
+      <li>
+        
+          <span class="nav__sub-title">Getting Started</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-quick-start-guide.html" class="">Quick Start</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-use_cases.html" class="">Use Cases</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-powered_by.html" class="">Talks &amp; Powered By</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-comparison.html" class="">Comparison</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-docker_demo.html" class="">Docker Demo</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">Documentation</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-concepts.html" class="">Concepts</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-writing_data.html" class="">Writing Data</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-querying_data.html" class="">Querying Data</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-configurations.html" class="active">Configurations</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-performance.html" class="">Performance</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-deployment.html" class="">Deployment</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">Other Info</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-docs-versions.html" class="">Docs Versions</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-privacy.html" class="">Privacy Policy</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+  </ul>
+</nav>
+    
+
+  
+  </div>
+
+
+  <article class="page" itemscope itemtype="https://schema.org/CreativeWork">
+
+    <div class="page__inner-wrap">
+      
+        <header>
+          <h1 id="page-title" class="page__title" itemprop="headline">Configurations
+</h1>
+        </header>
+      
+
+      <section class="page__content" itemprop="text">
+        
+        <aside class="sidebar__right sticky">
+          <nav class="toc">
+            <header><h4 class="nav__title"><i class="fas fa-file-alt"></i> IN THIS PAGE</h4></header>
+            <ul class="toc__menu">
+  <li><a href="#与云存储连接">Talking to Cloud Storage</a></li>
+  <li><a href="#spark-datasource">Spark Datasource Configs</a>
+    <ul>
+      <li><a href="#写选项">Write Options</a></li>
+      <li><a href="#读选项">Read Options</a></li>
+    </ul>
+  </li>
+  <li><a href="#writeclient-configs">WriteClient Configs</a>
+    <ul>
+      <li><a href="#索引配置">Index configs</a></li>
+      <li><a href="#存储选项">Storage configs</a></li>
+      <li><a href="#压缩配置">Compaction configs</a></li>
+      <li><a href="#指标配置">Metrics configs</a></li>
+      <li><a href="#内存配置">Memory configs</a></li>
+    </ul>
+  </li>
+</ul>
+          </nav>
+        </aside>
+        
+        <p>This page covers the different ways of configuring your job to write/read Hudi datasets.
+At a high level, you can control behaviour at a few levels.</p>
+
+<ul>
+  <li><strong><a href="#spark-datasource">Spark Datasource Configs</a></strong> : These configs control the Hudi Spark Datasource, providing the ability to
+ define keys/partitioning, pick out the write operation, specify how to merge records, or choose the view type to read.</li>
+  <li><strong><a href="#writeclient-configs">WriteClient Configs</a></strong> : Internally, the Hudi datasource uses the RDD-based <code class="highlighter-rouge">HoodieWriteClient</code> API
+ to actually perform the writes to storage. These configs provide deep control over lower-level aspects like file sizing, compression, parallelism, compaction, write schema, cleaning etc.
+ Although Hudi provides sane defaults, from time to time these configs may need to be tweaked to optimize for specific workloads.</li>
+  <li><strong><a href="#PAYLOAD_CLASS_OPT_KEY">RecordPayload Config</a></strong> : This is the lowest level of customization offered by Hudi.
+ A RecordPayload defines how to produce new values to upsert, given the incoming new record and the stored old record.
+ Hudi provides default implementations such as <code class="highlighter-rouge">OverwriteWithLatestAvroPayload</code>, which simply updates storage with the latest/last-written record.
+ This can be overridden to a custom class extending <code class="highlighter-rouge">HoodieRecordPayload</code>, at both the datasource and WriteClient levels.</li>
+</ul>
+
+<h2 id="与云存储连接">Talking to Cloud Storage</h2>
+
+<p>Irrespective of whether the RDD/WriteClient APIs or the datasource is used, the following information helps configure access to cloud stores.</p>
+
+<ul>
+  <li><a href="/cn/docs/0.5.1-s3_hoodie.html">AWS S3</a> <br />
+Configurations required for S3 and Hudi co-operability.</li>
+  <li><a href="/cn/docs/0.5.1-gcs_hoodie.html">Google Cloud Storage</a> <br />
+Configurations required for GCS and Hudi co-operability.</li>
+</ul>
+
+<h2 id="spark-datasource">Spark Datasource Configs</h2>
+
+<p>Spark jobs using the datasource can be configured by passing the below options into the <code class="highlighter-rouge">option(k,v)</code> method as usual.
+The actual datasource-level configs are listed below.</p>
+
+<h3 id="写选项">Write Options</h3>
+
+<p>Additionally, you can pass down any of the WriteClient-level configs directly, using the <code class="highlighter-rouge">options()</code> or <code class="highlighter-rouge">option(k,v)</code> methods.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">inputDF</span><span class="o">.</span><span class="na">write</span><span class="o">()</span>
+<span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"org.apache.hudi"</span><span class="o">)</span>
+<span class="o">.</span><span class="na">options</span><span class="o">(</span><span class="n">clientOpts</span><span class="o">)</span> <span class="c1">// any of the Hudi client opts can be passed in as well</span>
+<span class="o">.</span><span class="na">option</span><span class="o">(</span><span class="nc">DataSourceWriteOptions</span><span class="o">.</span><span class="na">RECORDKEY_FIELD_OPT_KEY</span><span class="o">(),</span> <span class="s">"_row_key"</span><span class="o">)</span>
+<span class="o">.</span><span class="na">option</span><span class="o">(</span><span class="nc">DataSourceWriteOptions</span><span class="o">.</span><span class="na">PARTITIONPATH_FIELD_OPT_KEY</span><span class="o">(),</span> <span class="s">"partition"</span><span class="o">)</span>
+<span class="o">.</span><span class="na">option</span><span class="o">(</span><span class="nc">DataSourceWriteOptions</span><span class="o">.</span><span class="na">PRECOMBINE_FIELD_OPT_KEY</span><span class="o">(),</span> <span class="s">"timestamp"</span><span class="o">)</span>
+<span class="o">.</span><span class="na">option</span><span class="o">(</span><span class="nc">HoodieWriteConfig</span><span class="o">.</span><span class="na">TABLE_NAME</span><span class="o">,</span> <span class="n">tableName</span><span class="o">)</span>
+<span class="o">.</span><span class="na">mode</span><span class="o">(</span><span class="nc">SaveMode</span><span class="o">.</span><span class="na">Append</span><span class="o">)</span>
+<span class="o">.</span><span class="na">save</span><span class="o">(</span><span class="n">basePath</span><span class="o">);</span>
+</code></pre></div></div>
+
+<p>Options useful for writing datasets via <code class="highlighter-rouge">write.format.option(...)</code></p>
+
+<h4 id="TABLE_NAME_OPT_KEY">TABLE_NAME_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.table.name</code> [Required]<br />
+  <span style="color:grey">Hive table name into which the dataset should be registered.</span></p>
+
+<h4 id="OPERATION_OPT_KEY">OPERATION_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.operation</code>, Default: <code class="highlighter-rouge">upsert</code><br />
+  <span style="color:grey">Whether to do upsert, insert or bulkinsert for the write operation. Use <code class="highlighter-rouge">bulkinsert</code> to load new data into a table, and there on use <code class="highlighter-rouge">upsert</code> or <code class="highlighter-rouge">insert</code>.
+  Bulk insert uses a disk-based write path to scale to loading large inputs without needing to cache them.</span></p>
+
+<h4 id="STORAGE_TYPE_OPT_KEY">STORAGE_TYPE_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.storage.type</code>, Default: <code class="highlighter-rouge">COPY_ON_WRITE</code> <br />
+  <span style="color:grey">The storage type for the underlying data of this write. It cannot change between writes.</span></p>
+
+<h4 id="PRECOMBINE_FIELD_OPT_KEY">PRECOMBINE_FIELD_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.precombine.field</code>, Default: <code class="highlighter-rouge">ts</code> <br />
+  <span style="color:grey">Field used in pre-combining before the actual write.
+  When two records have the same key value, the record with the largest value in the precombine field is picked, determined by Object.compareTo(..).</span></p>
+
+<h4 id="PAYLOAD_CLASS_OPT_KEY">PAYLOAD_CLASS_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.payload.class</code>, Default: <code class="highlighter-rouge">org.apache.hudi.OverwriteWithLatestAvroPayload</code> <br />
+  <span style="color:grey">Payload class to use. Override this if you want to roll your own merge logic when upserting/inserting.
+  This will render any value set for <code class="highlighter-rouge">PRECOMBINE_FIELD_OPT_VAL</code> ineffective</span></p>
+
+<h4 id="RECORDKEY_FIELD_OPT_KEY">RECORDKEY_FIELD_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.recordkey.field</code>, Default: <code class="highlighter-rouge">uuid</code> <br />
+  <span style="color:grey">Record key field. Used as the <code class="highlighter-rouge">recordKey</code> component of <code class="highlighter-rouge">HoodieKey</code>.
+  The actual value is obtained by invoking .toString() on the field value. Nested fields can be specified using dot notation, e.g. <code class="highlighter-rouge">a.b.c</code></span></p>
+
+<h4 id="PARTITIONPATH_FIELD_OPT_KEY">PARTITIONPATH_FIELD_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.partitionpath.field</code>, Default: <code class="highlighter-rouge">partitionpath</code> <br />
+  <span style="color:grey">Partition path field. Used as the <code class="highlighter-rouge">partitionPath</code> component of <code class="highlighter-rouge">HoodieKey</code>.
+  The actual value is obtained by invoking .toString()</span></p>
+
+<h4 id="KEYGENERATOR_CLASS_OPT_KEY">KEYGENERATOR_CLASS_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.keygenerator.class</code>, Default: <code class="highlighter-rouge">org.apache.hudi.SimpleKeyGenerator</code> <br />
+  <span style="color:grey">Key generator class that implements extracting the key out of the incoming <code class="highlighter-rouge">Row</code> object</span></p>
+
+<h4 id="COMMIT_METADATA_KEYPREFIX_OPT_KEY">COMMIT_METADATA_KEYPREFIX_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.commitmeta.key.prefix</code>, Default: <code class="highlighter-rouge">_</code> <br />
+  <span style="color:grey">Option keys beginning with this prefix are automatically added to the commit/deltacommit metadata.
+  This is useful for storing checkpointing information in a manner consistent with the hudi timeline</span></p>
+
+<h4 id="INSERT_DROP_DUPS_OPT_KEY">INSERT_DROP_DUPS_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.insert.drop.duplicates</code>, Default: <code class="highlighter-rouge">false</code> <br />
+  <span style="color:grey">If set to true, all duplicate records from the incoming DataFrame are filtered out during the insert operation.</span></p>
+
+<h4 id="HIVE_SYNC_ENABLED_OPT_KEY">HIVE_SYNC_ENABLED_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.enable</code>, Default: <code class="highlighter-rouge">false</code> <br />
+  <span style="color:grey">When set to true, registers and syncs the dataset to the Apache Hive metastore</span></p>
+
+<h4 id="HIVE_DATABASE_OPT_KEY">HIVE_DATABASE_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.database</code>, Default: <code class="highlighter-rouge">default</code> <br />
+  <span style="color:grey">Database to sync to</span></p>
+
+<h4 id="HIVE_TABLE_OPT_KEY">HIVE_TABLE_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.table</code>, [Required] <br />
+  <span style="color:grey">Table to sync to</span></p>
+
+<h4 id="HIVE_USER_OPT_KEY">HIVE_USER_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.username</code>, Default: <code class="highlighter-rouge">hive</code> <br />
+  <span style="color:grey">Hive user name to use</span></p>
+
+<h4 id="HIVE_PASS_OPT_KEY">HIVE_PASS_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.password</code>, Default: <code class="highlighter-rouge">hive</code> <br />
+  <span style="color:grey">Hive password to use</span></p>
+
+<h4 id="HIVE_URL_OPT_KEY">HIVE_URL_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.jdbcurl</code>, Default: <code class="highlighter-rouge">jdbc:hive2://localhost:10000</code> <br />
+  <span style="color:grey">Hive metastore url</span></p>
+
+<h4 id="HIVE_PARTITION_FIELDS_OPT_KEY">HIVE_PARTITION_FIELDS_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.partition_fields</code>, Default: <code class="highlighter-rouge"> </code> <br />
+  <span style="color:grey">Fields in the dataset used to determine the Hive partition columns.</span></p>
+
+<h4 id="HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY">HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.partition_extractor_class</code>, Default: <code class="highlighter-rouge">org.apache.hudi.hive.SlashEncodedDayPartitionValueExtractor</code> <br />
+  <span style="color:grey">Class used to extract partition field values into Hive partition columns.</span></p>
+
+<h4 id="HIVE_ASSUME_DATE_PARTITION_OPT_KEY">HIVE_ASSUME_DATE_PARTITION_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.assume_date_partitioning</code>, Default: <code class="highlighter-rouge">false</code> <br />
+  <span style="color:grey">Assume the partition layout is yyyy/mm/dd</span></p>
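Taken together, the Hive sync keys above are typically set as one batch alongside the write options. A minimal sketch, using only the property keys documented on this page — the database, table name, and partition field values are illustrative placeholders, not values from this page:

```properties
# Register and sync the dataset to the Hive metastore after each write
hoodie.datasource.hive_sync.enable=true
hoodie.datasource.hive_sync.database=default
# placeholder table name
hoodie.datasource.hive_sync.table=my_hudi_table
hoodie.datasource.hive_sync.jdbcurl=jdbc:hive2://localhost:10000
# placeholder partition field
hoodie.datasource.hive_sync.partition_fields=partition
hoodie.datasource.hive_sync.partition_extractor_class=org.apache.hudi.hive.SlashEncodedDayPartitionValueExtractor
```

These can be passed to the datasource through the same `options()` call shown in the write example above.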
+
+<h3 id="读选项">Read Options</h3>
+
+<p>Options useful for reading datasets via <code class="highlighter-rouge">read.format.option(...)</code></p>
+
+<h4 id="VIEW_TYPE_OPT_KEY">VIEW_TYPE_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.view.type</code>, Default: <code class="highlighter-rouge">read_optimized</code> <br />
+<span style="color:grey">Whether the data needs to be read in incremental mode (new data since an instantTime),
+(or) read-optimized mode (obtain latest view, based on columnar data),
+(or) realtime mode (obtain latest view, based on row &amp; columnar data)</span></p>
+
+<h4 id="BEGIN_INSTANTTIME_OPT_KEY">BEGIN_INSTANTTIME_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.read.begin.instanttime</code>, [Required in incremental mode] <br />
+<span style="color:grey">Instant time to start incrementally pulling data from. The instanttime here need not necessarily correspond to an instant on the timeline.
+New data written with an <code class="highlighter-rouge">instant_time &gt; BEGIN_INSTANTTIME</code> are fetched out.
+E.g.: '20170901080000' will get all new data written after Sep 1, 2017 08:00AM.</span></p>
+
+<h4 id="END_INSTANTTIME_OPT_KEY">END_INSTANTTIME_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.read.end.instanttime</code>, Default: latest instant (i.e. fetches all new data since the begin instant) <br />
+<span style="color:grey">Instant time to limit incrementally fetched data to. New data written with an <code class="highlighter-rouge">instant_time &lt;= END_INSTANTTIME</code> are fetched out.</span></p>
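The read keys above combine into an incremental pull, sketched below with the property keys listed on this page — the two instant times are illustrative placeholders in the `yyyyMMddHHmmss` format used in the example above:

```properties
# Incrementally pull commits in the window (20170901080000, 20170902080000]
hoodie.datasource.view.type=incremental
hoodie.datasource.read.begin.instanttime=20170901080000
hoodie.datasource.read.end.instanttime=20170902080000
```

Dropping the end instant falls back to the default, i.e. everything written after the begin instant.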
+
+<h2 id="writeclient-configs">WriteClient Configs</h2>
+
+<p>Jobs programming directly against the RDD-level APIs can build a <code class="highlighter-rouge">HoodieWriteConfig</code> object and pass it to the <code class="highlighter-rouge">HoodieWriteClient</code> constructor.
+HoodieWriteConfig can be built using a builder pattern as below.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nc">HoodieWriteConfig</span> <span class="n">cfg</span> <span class="o">=</span> <span class="nc">HoodieWriteConfig</span><span class="o">.</span><span class="na">newBuilder</span><span class="o">()</span>
+        <span class="o">.</span><span class="na">withPath</span><span class="o">(</span><span class="n">basePath</span><span class="o">)</span>
+        <span class="o">.</span><span class="na">forTable</span><span class="o">(</span><span class="n">tableName</span><span class="o">)</span>
+        <span class="o">.</span><span class="na">withSchema</span><span class="o">(</span><span class="n">schemaStr</span><span class="o">)</span>
+        <span class="o">.</span><span class="na">withProps</span><span class="o">(</span><span class="n">props</span><span class="o">)</span> <span class="c1">// pass raw k,v pairs from a property file.</span>
+        <span class="o">.</span><span class="na">withCompactionConfig</span><span class="o">(</span><span class="nc">HoodieCompactionConfig</span><span class="o">.</span><span class="na">newBuilder</span><span class="o">().</span><span class="na">withXXX</span><span class="o">(...).</span><span class="na">build</span><span class="o">())</span>
+        <span class="o">.</span><span class="na">withIndexConfig</span><span class="o">(</span><span class="nc">HoodieIndexConfig</span><span class="o">.</span><span class="na">newBuilder</span><span class="o">().</span><span class="na">withXXX</span><span class="o">(...).</span><span class="na">build</span><span class="o">())</span>
+        <span class="o">...</span>
+        <span class="o">.</span><span class="na">build</span><span class="o">();</span>
+</code></pre></div></div>
+
+<p>Following subsections go over different aspects of write configs, explaining the most important configs with their property names and default values.</p>
+
+<h4 id="withPath">withPath(hoodie_base_path)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.base.path</code> [Required] <br />
+<span style="color:grey">Base DFS path under which all the data partitions are created.
+Always prefix it explicitly with the storage scheme (e.g. hdfs://, s3:// etc).
+Hudi stores all the main metadata about commits, savepoints, cleaning audit logs etc in a .hoodie directory under this base directory.</span></p>
+
+<h4 id="withSchema">withSchema(schema_str)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.avro.schema</code> [Required]<br />
+<span style="color:grey">This is the current reader avro schema for the dataset,
+passed as a string of the entire schema. HoodieWriteClient passes this schema to implementations of HoodieRecordPayload to convert records from the source format to avro.
+This schema is also used when re-writing records during an update.</span></p>
+
+<h4 id="forTable">forTable(table_name)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.table.name</code> [Required] <br />
+ <span style="color:grey">Table name for the dataset, which will be used for registering with Hive. Needs to be the same across runs.</span></p>
+
+<h4 id="withBulkInsertParallelism">withBulkInsertParallelism(bulk_insert_parallelism = 1500)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.bulkinsert.shuffle.parallelism</code><br />
+<span style="color:grey">Bulk insert is meant for large initial imports, and this parallelism determines the initial number of files in your dataset.
+Tune this to achieve the desired optimal size during the initial import.</span></p>
+
+<h4 id="withParallelism">withParallelism(insert_shuffle_parallelism = 1500, upsert_shuffle_parallelism = 1500)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.insert.shuffle.parallelism</code>, <code class="highlighter-rouge">hoodie.upsert.shuffle.parallelism</code><br />
+<span style="color:grey">Once data has been initially imported, this parallelism controls the initial parallelism for reading input records.
+Ensure this value is high enough, e.g. 1 partition for 1 GB of input data</span></p>
+
+<h4 id="combineInput">combineInput(on_insert = false, on_update=true)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.combine.before.insert</code>, <code class="highlighter-rouge">hoodie.combine.before.upsert</code><br />
+<span style="color:grey">Flag which first combines the input RDD and merges multiple partial records into a single record, before inserting or updating in DFS</span></p>
+
+<h4 id="withWriteStatusStorageLevel">withWriteStatusStorageLevel(level = MEMORY_AND_DISK_SER)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.write.status.storage.level</code><br />
+<span style="color:grey">HoodieWriteClient.insert and HoodieWriteClient.upsert return a persisted RDD[WriteStatus],
+so that the client can choose to inspect the WriteStatus and decide whether to commit based on failures. This configures the storage level of that RDD</span></p>
+
+<h4 id="withAutoCommit">withAutoCommit(autoCommit = true)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.auto.commit</code><br />
+<span style="color:grey">Should HoodieWriteClient auto-commit after insert and upsert.
+A client can choose to turn off auto-commit and commit on a "defined success condition"</span></p>
+
+<h4 id="withAssumeDatePartitioning">withAssumeDatePartitioning(assumeDatePartitioning = false)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.assume.date.partitioning</code><br />
+<span style="color:grey">Should HoodieWriteClient assume the data is partitioned by date, i.e. three levels from the base path.
+This is a stop-gap to support tables created by versions &lt; 0.3.1. It will eventually be removed</span></p>
+
+<h4 id="withConsistencyCheckEnabled">withConsistencyCheckEnabled(enabled = false)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.consistency.check.enabled</code><br />
+<span style="color:grey">Should HoodieWriteClient perform additional checks to ensure written files are listable on the underlying filesystem/storage.
+Set this to true to work around S3's eventual-consistency model, and to ensure all data written as part of a commit is faithfully available for queries.</span></p>
+
+<h3 id="索引配置">Index configs</h3>
+<p>The following configs control indexing behaviour, which tags incoming records as either inserts or updates to older records.</p>
+
+<p><a href="#withIndexConfig">withIndexConfig</a> (HoodieIndexConfig) <br />
+<span style="color:grey">Pluggable, to have an external index (HBase) or use the default bloom filter stored in the Parquet files</span></p>
+
+<h4 id="withIndexType">withIndexType(indexType = BLOOM)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.index.type</code> <br />
+<span style="color:grey">Type of index to use. Defaults to bloom filter. Possible options are [BLOOM | HBASE | INMEMORY].
+Bloom filters remove the dependency on an external system and are stored in the footer of the Parquet data files</span></p>
+
+<h4 id="bloomFilterNumEntries">bloomFilterNumEntries(numEntries = 60000)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.index.bloom.num_entries</code> <br />
+<span style="color:grey">Only applies if index type is BLOOM.<br />This is the number of entries to be stored in the bloom filter.
+We assume a maxParquetFileSize of 128MB and an averageRecordSize of 1024B, and hence a total of ~130K records in a file.
+The default (60000) is roughly half of this approximation. <a href="https://issues.apache.org/jira/browse/HUDI-56">HUDI-56</a>
+tracks computing this dynamically.
+Warning: setting this very low generates a lot of false positives and index lookups will have to scan many more files than they should; setting it very high increases the size of each data file linearly (roughly 4KB for every 50000 entries).</span></p>
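The sizing arithmetic in the paragraph above can be checked quickly, using only the numbers stated there:

```python
# Approximate records per parquet file, given the assumptions in the docs
max_parquet_file_size = 128 * 1024 * 1024  # 128MB
average_record_size = 1024                 # 1024B per record

records_per_file = max_parquet_file_size // average_record_size
print(records_per_file)        # 131072, i.e. ~130K records per file

# The bloom filter default is sized at roughly half of that estimate
print(records_per_file // 2)   # 65536, close to the 60000 default
```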
+
+<h4 id="bloomFilterFPP">bloomFilterFPP(fpp = 0.000000001)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.index.bloom.fpp</code> <br />
+<span style="color:grey">Only applies if index type is BLOOM.<br />Error rate allowed, given the number of entries.
+This is used to calculate how many bits should be assigned to the bloom filter and the number of hash functions. This is usually set very low (default: 0.000000001), since we trade off disk space for lower false positives</span></p>
+
+<h4 id="bloomIndexPruneByRanges">bloomIndexPruneByRanges(pruneRanges = true)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.bloom.index.prune.by.ranges</code> <br />
+<span style="color:grey">Only applies if index type is BLOOM.<br />When true, range information from files is used to speed up index lookups. Particularly helpful if the key has a monotonically increasing prefix, such as a timestamp.</span></p>
+
+<h4 id="bloomIndexUseCaching">bloomIndexUseCaching(useCaching = true)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.bloom.index.use.caching</code> <br />
+<span style="color:grey">Only applies if index type is BLOOM.<br />When true, the input RDD is cached to speed up index lookups, by reducing the IO needed to compute parallelism or affected partitions</span></p>
+
+<h4 id="bloomIndexTreebasedFilter">bloomIndexTreebasedFilter(useTreeFilter = true)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.bloom.index.use.treebased.filter</code> <br />
+<span style="color:grey">Only applies if index type is BLOOM.<br />When true, an interval-tree based file filtering optimization is enabled. This mode speeds up file filtering based on key ranges, compared with brute-force mode</span></p>
+
+<h4 id="bloomIndexBucketizedChecking">bloomIndexBucketizedChecking(bucketizedChecking = true)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.bloom.index.bucketized.checking</code> <br />
+<span style="color:grey">Only applies if index type is BLOOM.<br />When true, bucketized bloom filtering is enabled. This reduces the skew seen in sort-based bloom index lookups</span></p>
+
+<h4 id="bloomIndexKeysPerBucket">bloomIndexKeysPerBucket(keysPerBucket = 10000000)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.bloom.index.keys.per.bucket</code> <br />
+<span style="color:grey">Only applies if bloomIndexBucketizedChecking is enabled and the index type is bloom.<br />
+This configuration controls the "bucket" size, which tracks the number of record-key checks made against a single file, and is the unit of work allotted to each partition performing bloom filter lookups.
+A higher value would amortize the fixed cost of reading a bloom filter into memory.</span></p>
+
+<h4 id="bloomIndexParallelism">bloomIndexParallelism(0)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.bloom.index.parallelism</code> <br />
+<span style="color:grey">Only applies if index type is BLOOM.<br />This is the parallelism for index lookup, which involves a Spark shuffle. By default, it is computed automatically based on input workload characteristics</span></p>
+
+<h4 id="hbaseZkQuorum">hbaseZkQuorum(zkString) [Required]</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.index.hbase.zkquorum</code> <br />
+<span style="color:grey">Only applies if index type is HBASE. HBase ZK quorum URL to connect to.</span></p>
+
+<h4 id="hbaseZkPort">hbaseZkPort(port) [Required]</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.index.hbase.zkport</code> <br />
+<span style="color:grey">Only applies if index type is HBASE. HBase ZK quorum port to connect to.</span></p>
+
+<h4 id="hbaseZkZnodeParent">hbaseZkZnodeParent(zkZnodeParent)  [Required]</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.index.hbase.zknode.path</code> <br />
+<span style="color:grey">Only applies if index type is HBASE. This is the root znode that will contain all the znodes created/used by HBase.</span></p>
+
+<h4 id="hbaseTableName">hbaseTableName(tableName)  [Required]</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.index.hbase.table</code> <br />
+<span style="color:grey">Only applies if index type is HBASE. HBase table name to use as the index. Hudi stores the row_key and [partition_path, fileID, commitTime] mapping in this table.</span></p>
+
+<h3 id="存储选项">Storage configs</h3>
+<p>Controls aspects around sizing parquet and log files.</p>
+
+<p><a href="#withStorageConfig">withStorageConfig</a> (HoodieStorageConfig) <br /></p>
+
+<h4 id="limitFileSize">limitFileSize (size = 120MB)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.parquet.max.file.size</code> <br />
+<span style="color:grey">Target size for parquet files produced by Hudi write phases. For DFS, this needs to be aligned with the underlying filesystem block size for optimal performance.</span></p>
+
+<h4 id="parquetBlockSize">parquetBlockSize(rowgroupsize = 120MB)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.parquet.block.size</code> <br />
+<span style="color:grey">Parquet row group size. It's best kept the same as the file size, so that a single column within a file is stored continuously on disk</span></p>
+
+<h4 id="parquetPageSize">parquetPageSize(pagesize = 1MB)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.parquet.page.size</code> <br />
+<span style="color:grey">Parquet page size. A page is the unit of read within a parquet file. Within a block, pages are compressed separately.</span></p>
+
+<h4 id="parquetCompressionRatio">parquetCompressionRatio(parquetCompressionRatio = 0.1)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.parquet.compression.ratio</code> <br />
+<span style="color:grey">Expected compression of parquet data, used by Hudi when it tries to size new parquet files.
+Increase this value if bulk_insert is producing files smaller than the expected size</span></p>
+
+<h4 id="parquetCompressionCodec">parquetCompressionCodec(parquetCompressionCodec = gzip)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.parquet.compression.codec</code> <br />
+<span style="color:grey">Parquet compression codec name. Default is gzip. Possible options are [gzip | snappy | uncompressed | lzo]</span></p>
+
+<h4 id="logFileMaxSize">logFileMaxSize(logFileSize = 1GB)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.logfile.max.size</code> <br />
+<span style="color:grey">LogFile max size. This is the maximum size allowed for a log file before it is rolled over to the next version.</span></p>
+
+<h4 id="logFileDataBlockMaxSize">logFileDataBlockMaxSize(dataBlockSize = 256MB)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.logfile.data.block.max.size</code> <br />
+<span style="color:grey">LogFile data block max size. This is the maximum size allowed for a single data block to be appended to a log file.
+This helps to make sure the data appended to the log file is broken up into sizable blocks to prevent OOM errors. This size should be less than the available JVM memory.</span></p>
+
+<h4 id="logFileToParquetCompressionRatio">logFileToParquetCompressionRatio(logFileToParquetCompressionRatio = 0.35)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.logfile.to.parquet.compression.ratio</code> <br />
+<span style="color:grey">Expected additional compression as records move from log files to parquet. Used for merge_on_read storage to send inserts into log files and control the size of compacted parquet files.</span></p>
+
+<h3 id="压缩配置">Compaction configs</h3>
+<p>Configs that control compaction (merging of log files onto a new parquet base file), as well as cleaning (reclamation of older/unused file groups).
+<a href="#withCompactionConfig">withCompactionConfig</a> (HoodieCompactionConfig) <br /></p>
+
+<h4 id="withCleanerPolicy">withCleanerPolicy(policy = KEEP_LATEST_COMMITS)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.cleaner.policy</code> <br />
+<span style="color:grey">Cleaning policy to be used. Hudi will delete older versions of parquet files to re-claim space.
+Any query/computation referring to this version of the file will fail. It's good to make sure that the data is retained for more than the maximum query execution time.</span></p>
+
+<h4 id="retainCommits">retainCommits(no_of_commits_to_retain = 24)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.cleaner.commits.retained</code> <br />
+<span style="color:grey">Number of commits to retain. So data will be retained for num_of_commits * time_between_commits (scheduled).
+This also directly translates into how far back you can incrementally pull on this dataset</span></p>
+
+<h4 id="archiveCommitsWith">archiveCommitsWith(minCommits = 96, maxCommits = 128)</h4>
+<p>属性:<code class="highlighter-rouge">hoodie.keep.min.commits</code>, <code class="highlighter-rouge">hoodie.keep.max.commits</code> <br />
+<span style="color:grey">每个提交都是<code class="highlighter-rouge">.hoodie</code>目录中的一个小文件。由于DFS通常不支持大量小文件,因此Hudi将较早的提交归档到顺序日志中。
+提交通过重命名提交文件以原子方式发布。</span></p>
+
+<h4 id="withCommitsArchivalBatchSize">withCommitsArchivalBatchSize(batch = 10)</h4>
+<p>属性:<code class="highlighter-rouge">hoodie.commits.archival.batch</code> <br />
+<span style="color:grey">这控制着批量读取并一起归档的提交即时的数量。</span></p>
+
+<h4 id="compactionSmallFileSize">compactionSmallFileSize(size = 0)</h4>
+<p>属性:<code class="highlighter-rouge">hoodie.parquet.small.file.limit</code> <br />
+<span style="color:grey">该值应小于maxFileSize,如果将其设置为0,会关闭此功能。
+由于批处理中分区中插入记录的数量众多,总会出现小文件。
+Hudi提供了一个选项,可以通过将对该分区中的插入作为对现有小文件的更新来解决小文件的问题。
+此处的大小是被视为“小文件大小”的最小文件大小。</span></p>
+
+<h4 id="insertSplitSize">insertSplitSize(size = 500000)</h4>
+<p>属性:<code class="highlighter-rouge">hoodie.copyonwrite.insert.split.size</code> <br />
+<span style="color:grey">插入写入并行度。为单个分区的总共插入次数。
+写出100MB的文件,至少1kb大小的记录,意味着每个文件有100K记录。默认值是超额配置为500K。
+为了改善插入延迟,请对其进行调整以匹配单个文件中的记录数。
+将此值设置为较小的值将导致文件变小(尤其是当compactionSmallFileSize为0时)</span></p>
+
+<h4 id="autoTuneInsertSplits">autoTuneInsertSplits(true)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.copyonwrite.insert.auto.split</code> <br />
+<span style="color:grey">Whether Hudi should dynamically compute the insertSplitSize based on the last 24 commits' metadata. Turned off by default.</span></p>
+
+<h4 id="approxRecordSize">approxRecordSize(size = 1024)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.copyonwrite.record.size.estimate</code> <br />
+<span style="color:grey">The average record size. If specified, Hudi will use this and not compute it dynamically from the last 24 commits' metadata. No value is set by default.
+This is critical in computing the insert parallelism and bin-packing inserts into small files, as described above.</span></p>
+
+<h4 id="withInlineCompaction">withInlineCompaction(inlineCompaction = false)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.compact.inline</code> <br />
+<span style="color:grey">When set to true, compaction is triggered by the ingestion itself, right after the commit or deltacommit action of an insert, upsert or bulk_insert.</span></p>
+
+<h4 id="withMaxNumDeltaCommitsBeforeCompaction">withMaxNumDeltaCommitsBeforeCompaction(maxNumDeltaCommitsBeforeCompaction = 10)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.compact.inline.max.delta.commits</code> <br />
+<span style="color:grey">Maximum number of delta commits to accumulate before triggering an inline compaction.</span></p>
+
+<h4 id="withCompactionLazyBlockReadEnabled">withCompactionLazyBlockReadEnabled(true)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.compaction.lazy.block.read</code> <br />
+<span style="color:grey">When the CompactedLogScanner merges all log files, this config chooses whether the log blocks should be read lazily.
+Choose true for I/O-intensive lazy block reading (low memory usage), or false for memory-intensive immediate block reading (high memory usage).</span></p>
+
+<h4 id="withCompactionReverseLogReadEnabled">withCompactionReverseLogReadEnabled(false)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.compaction.reverse.log.read</code> <br />
+<span style="color:grey">HoodieLogFormatReader reads a log file in the forward direction, from pos=0 to pos=file_length.
+If this config is set to true, the Reader reads the log file in reverse, from pos=file_length to pos=0.</span></p>
+
+<h4 id="withCleanerParallelism">withCleanerParallelism(cleanerParallelism = 200)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.cleaner.parallelism</code> <br />
+<span style="color:grey">Increase this if cleaning becomes slow.</span></p>
+
+<h4 id="withCompactionStrategy">withCompactionStrategy(compactionStrategy = org.apache.hudi.io.compact.strategy.LogFileSizeBasedCompactionStrategy)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.compaction.strategy</code> <br />
+<span style="color:grey">Compaction strategy that decides which file groups are picked up for compaction during each compaction run.
+By default, Hudi picks the log files with the most accumulated unmerged data.</span></p>
+
+<h4 id="withTargetIOPerCompactionInMB">withTargetIOPerCompactionInMB(targetIOPerCompactionInMB = 500000)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.compaction.target.io</code> <br />
+<span style="color:grey">Amount of MBs to spend during a compaction run for the LogFileSizeBasedCompactionStrategy. This value helps bound ingestion latency when compaction runs in inline mode.</span></p>
+
+<h4 id="withTargetPartitionsPerDayBasedCompaction">withTargetPartitionsPerDayBasedCompaction(targetPartitionsPerCompaction = 10)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.compaction.daybased.target</code> <br />
+<span style="color:grey">Used by org.apache.hudi.io.compact.strategy.DayBasedCompactionStrategy to denote the number of latest partitions to compact during a compaction run.</span></p>
+
+<h4 id="payloadClassName">withPayloadClass(payloadClassName = org.apache.hudi.common.model.HoodieAvroPayload)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.compaction.payload.class</code> <br />
+<span style="color:grey">This needs to be the same class used during insert/upserts.
+Just like writing, compaction uses the record payload class to merge records in the log against each other, merge them again with the base file, and produce the final record to be written after compaction.</span></p>
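+
+<p>The setters in this section hang off the <a href="#withCompactionConfig">withCompactionConfig</a> builder named above. A minimal sketch of wiring a few of them into a write config (method names as documented on this page; <code class="highlighter-rouge">basePath</code> is a placeholder and the values are illustrative, not recommendations):</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Sketch only: a handful of the compaction knobs documented above.
+HoodieWriteConfig cfg = HoodieWriteConfig.newBuilder()
+    .withPath(basePath)                       // placeholder dataset location
+    .withCompactionConfig(HoodieCompactionConfig.newBuilder()
+        .withCleanerPolicy(HoodieCleaningPolicy.KEEP_LATEST_COMMITS)
+        .retainCommits(24)                    // keep 24 commits of history
+        .archiveCommitsWith(96, 128)          // archive older commit files
+        .withInlineCompaction(false)          // compact out-of-band
+        .withMaxNumDeltaCommitsBeforeCompaction(10)
+        .build())
+    .build();
+</code></pre></div></div>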
+
+<h3 id="指标配置">Metrics configs</h3>
+<p>Enables reporting of Hudi metrics to graphite.
+<a href="#withMetricsConfig">withMetricsConfig</a> (HoodieMetricsConfig) <br />
+<span style="color:grey">Hudi publishes metrics on every commit, clean, rollback etc.</span></p>
+
+<h4 id="on">on(metricsOn = true)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.metrics.on</code> <br />
+<span style="color:grey">Turn sending metrics on/off. On by default.</span></p>
+
+<h4 id="withReporterType">withReporterType(reporterType = GRAPHITE)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.metrics.reporter.type</code> <br />
+<span style="color:grey">Type of metrics reporter. Graphite is the default and the only supported type.</span></p>
+
+<h4 id="toGraphiteHost">toGraphiteHost(host = localhost)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.metrics.graphite.host</code> <br />
+<span style="color:grey">Graphite host to connect to.</span></p>
+
+<h4 id="onGraphitePort">onGraphitePort(port = 4756)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.metrics.graphite.port</code> <br />
+<span style="color:grey">Graphite port to connect to.</span></p>
+
+<h4 id="usePrefix">usePrefix(prefix = “”)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.metrics.graphite.metric.prefix</code> <br />
+<span style="color:grey">Standard prefix applied to all metrics. This helps to add information such as datacenter or environment.</span></p>
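+
+<p>A corresponding sketch for the metrics group, using only the setters documented above (the host, port and prefix values are placeholders):</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Sketch only: report metrics for this writer to a graphite instance.
+HoodieWriteConfig cfg = HoodieWriteConfig.newBuilder()
+    .withPath(basePath)                       // placeholder dataset location
+    .withMetricsConfig(HoodieMetricsConfig.newBuilder()
+        .on(true)                             // metrics are on by default
+        .toGraphiteHost("graphite.example.com") // placeholder host
+        .onGraphitePort(4756)
+        .usePrefix("prod.dc1")                // e.g. environment + datacenter
+        .build())
+    .build();
+</code></pre></div></div>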
+
+<h3 id="内存配置">Memory configs</h3>
+<p>Controls memory usage for compaction and merges performed internally by Hudi.
+<a href="#withMemoryConfig">withMemoryConfig</a> (HoodieMemoryConfig) <br />
+<span style="color:grey">Memory related configs</span></p>
+
+<h4 id="withMaxMemoryFractionPerPartitionMerge">withMaxMemoryFractionPerPartitionMerge(maxMemoryFractionPerPartitionMerge = 0.6)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.memory.merge.fraction</code> <br />
+<span style="color:grey">This fraction is multiplied with the user memory fraction (1 - spark.memory.fraction) to get the final fraction of heap space to use during merge.</span></p>
+
+<h4 id="withMaxMemorySizePerCompactionInBytes">withMaxMemorySizePerCompactionInBytes(maxMemorySizePerCompactionInBytes = 1GB)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.memory.compaction.fraction</code> <br />
+<span style="color:grey">HoodieCompactedLogScanner reads log blocks, converts records to HoodieRecords and then merges these log blocks and records.
+At any point, the number of entries in a log block can be less than or equal to the number of entries in the corresponding parquet file. This can lead to OOM in the Scanner.
+Hence, a spillable map helps alleviate the memory pressure. Use this config to set the maximum allowable in-memory footprint of the spillable map.</span></p>
+
+<h4 id="withWriteStatusFailureFraction">withWriteStatusFailureFraction(failureFraction = 0.1)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.memory.writestatus.failure.fraction</code> <br />
+<span style="color:grey">This property controls what fraction of failed records and exceptions are reported back to the driver.</span></p>
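+
+<p>Like the other groups, the memory settings above reduce to plain keys; an illustrative fragment (values are the documented defaults, not recommendations):</p>
+
+<div class="language-properties highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Illustrative values only
+hoodie.memory.merge.fraction=0.6
+hoodie.memory.writestatus.failure.fraction=0.1
+</code></pre></div></div>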
+
+      </section>
+
+      <a href="#masthead__inner-wrap" class="back-to-top">Back to top &uarr;</a>
+
+
+      
+
+    </div>
+
+  </article>
+
+</div>
+
+    </div>
+
+    <div class="page__footer">
+      <footer>
+        
+<div class="row">
+  <div class="col-lg-12 footer">
+    <p>
+      <a class="footer-link-img" href="https://apache.org">
+        <img width="250px" src="/assets/images/asf_logo.svg" alt="The Apache Software Foundation">
+      </a>
+    </p>
+    <p>
+      Copyright &copy; <span id="copyright-year">2019</span> <a href="https://apache.org">The Apache Software Foundation</a>, Licensed under the Apache License, Version 2.0.
+      Hudi, Apache and the Apache feather logo are trademarks of The Apache Software Foundation. <a href="/docs/privacy">Privacy Policy</a>
+      <br>
+      Apache Hudi is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the <a href="http://incubator.apache.org/">Apache Incubator</a>.
+      Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have
+      stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a
+      reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
+    </p>
+  </div>
+</div>
+      </footer>
+    </div>
+
+    
+<script src="/assets/js/main.min.js"></script>
+
+
+  </body>
+</html>
\ No newline at end of file
diff --git a/content/cn/docs/0.5.1-deployment.html b/content/cn/docs/0.5.1-deployment.html
new file mode 100644
index 0000000..7755220
--- /dev/null
+++ b/content/cn/docs/0.5.1-deployment.html
@@ -0,0 +1,813 @@
+<!doctype html>
+<html lang="en" class="no-js">
+  <head>
+    <meta charset="utf-8">
+
+<!-- begin _includes/seo.html --><title>Administering Hudi Pipelines - Apache Hudi</title>
+<meta name="description" content="Admins/ops can gain visibility into Hudi datasets/pipelines with the following">
+
+<meta property="og:type" content="article">
+<meta property="og:locale" content="en_US">
+<meta property="og:site_name" content="">
+<meta property="og:title" content="Administering Hudi Pipelines">
+<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.1-deployment.html">
+
+
+  <meta property="og:description" content="Admins/ops can gain visibility into Hudi datasets/pipelines with the following">
+
+
+
+
+
+  <meta property="article:modified_time" content="2019-12-30T14:59:57-05:00">
+
+
+
+
+
+
+
+<!-- end _includes/seo.html -->
+
+
+<!--<link href="/feed.xml" type="application/atom+xml" rel="alternate" title=" Feed">-->
+
+<!-- https://t.co/dKP3o1e -->
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+<script>
+  document.documentElement.className = document.documentElement.className.replace(/\bno-js\b/g, '') + ' js ';
+</script>
+
+<!-- For all browsers -->
+<link rel="stylesheet" href="/assets/css/main.css">
+
+<!--[if IE]>
+  <style>
+    /* old IE unsupported flexbox fixes */
+    .greedy-nav .site-title {
+      padding-right: 3em;
+    }
+    .greedy-nav button {
+      position: absolute;
+      top: 0;
+      right: 0;
+      height: 100%;
+    }
+  </style>
+<![endif]-->
+
+
+
+<link rel="icon" type="image/x-icon" href="/assets/images/favicon.ico">
+<link rel="stylesheet" href="/assets/css/font-awesome.min.css">
+
+  </head>
+
+  <body class="layout--single">
+    <!--[if lt IE 9]>
+<div class="notice--danger align-center" style="margin: 0;">You are using an <strong>outdated</strong> browser. Please <a href="https://browsehappy.com/">upgrade your browser</a> to improve your experience.</div>
+<![endif]-->
+
+    <div class="masthead">
+  <div class="masthead__inner-wrap" id="masthead__inner-wrap">
+    <div class="masthead__menu">
+      <nav id="site-nav" class="greedy-nav">
+        
+          <a class="site-logo" href="/">
+              <div style="width: 150px; height: 40px">
+              </div>
+          </a>
+        
+        <a class="site-title" href="/">
+          
+        </a>
+        <ul class="visible-links"><li class="masthead__menu-item">
+              <a href="/cn/docs/quick-start-guide.html" target="_self" >Documentation</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/community.html" target="_self" >Community</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/activity.html" target="_self" >Activities</a>
+            </li><li class="masthead__menu-item">
+              <a href="https://cwiki.apache.org/confluence/display/HUDI/FAQ" target="_blank" >FAQ</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/releases.html" target="_self" >Releases</a>
+            </li></ul>
+        <button class="greedy-nav__toggle hidden" type="button">
+          <span class="visually-hidden">Toggle menu</span>
+          <div class="navicon"></div>
+        </button>
+        <ul class="hidden-links hidden"></ul>
+      </nav>
+    </div>
+  </div>
+</div>
+<!--
+<p class="notice--warning" style="margin: 0 !important; text-align: center !important;"><strong>Note:</strong> This site is work in progress, if you notice any issues, please <a target="_blank" href="https://github.com/apache/incubator-hudi/issues">Report on Issue</a>.
+  Click <a href="/"> here</a> back to old site.</p>
+-->
+
+    <div class="initial-content">
+      <div id="main" role="main">
+  
+
+  <div class="sidebar sticky">
+
+  
+
+  
+
+    
+      
+
+
+
+
+
+
+
+<nav class="nav__list">
+  
+  <input id="ac-toc" name="accordion-toc" type="checkbox" />
+  <label for="ac-toc">Documentation Menu</label>
+  <ul class="nav__items">
+    
+      <li>
+        
+          <span class="nav__sub-title">Getting Started</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-quick-start-guide.html" class="">Quick-Start Guide</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-use_cases.html" class="">Use Cases</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-powered_by.html" class="">Talks &amp; Powered By</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-comparison.html" class="">Comparison</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-docker_demo.html" class="">Docker Demo</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">Documentation</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-concepts.html" class="">Concepts</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-writing_data.html" class="">Writing Data</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-querying_data.html" class="">Querying Data</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-configurations.html" class="">Configurations</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-performance.html" class="">Performance</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-deployment.html" class="active">Administering</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">Other Info</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-docs-versions.html" class="">Docs Versions</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-privacy.html" class="">Privacy Policy</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+  </ul>
+</nav>
+    
+
+  
+  </div>
+
+
+  <article class="page" itemscope itemtype="https://schema.org/CreativeWork">
+
+    <div class="page__inner-wrap">
+      
+        <header>
+          <h1 id="page-title" class="page__title" itemprop="headline">Administering Hudi Pipelines
+</h1>
+        </header>
+      
+
+      <section class="page__content" itemprop="text">
+        
+        <aside class="sidebar__right sticky">
+          <nav class="toc">
+            <header><h4 class="nav__title"><i class="fas fa-file-alt"></i> IN THIS PAGE</h4></header>
+            <ul class="toc__menu">
+  <li><a href="#admin-cli">Admin CLI</a>
+    <ul>
+      <li><a href="#检查提交">Inspecting Commits</a></li>
+      <li><a href="#深入到特定的提交">Drilling down to a specific Commit</a></li>
+      <li><a href="#文件系统视图">FileSystem View</a></li>
+      <li><a href="#统计信息">Statistics</a></li>
+      <li><a href="#归档的提交">Archived Commits</a></li>
+      <li><a href="#压缩">Compactions</a></li>
+      <li><a href="#验证压缩">Validate Compaction</a></li>
+      <li><a href="#注意">Note</a></li>
+      <li><a href="#取消调度压缩">Unscheduling Compaction</a></li>
+      <li><a href="#修复压缩">Repair Compaction</a></li>
+    </ul>
+  </li>
+  <li><a href="#metrics">Metrics</a></li>
+  <li><a href="#troubleshooting">Troubleshooting</a>
+    <ul>
+      <li><a href="#缺失记录">Missing records</a></li>
+      <li><a href="#重复">Duplicates</a></li>
+      <li><a href="#spark-ui">Spark failures</a></li>
+    </ul>
+  </li>
+</ul>
+          </nav>
+        </aside>
+        
+        <p>Admins/ops can gain visibility into Hudi datasets/pipelines with the following</p>
+
+<ul>
+  <li><a href="#admin-cli">Administering via the Admin CLI</a></li>
+  <li><a href="#metrics">Graphite metrics</a></li>
+  <li><a href="#spark-ui">Spark UI of the Hudi application</a></li>
+</ul>
+
+<p>This section provides a glimpse into each of these, along with some general guidance on <a href="#troubleshooting">troubleshooting</a>.</p>
+
+<h2 id="admin-cli">Admin CLI</h2>
+
+<p>Once hudi has been built, the shell can be started via <code class="highlighter-rouge">cd hudi-cli &amp;&amp; ./hudi-cli.sh</code>.
+A hudi dataset resides on DFS at a location referred to as the <strong>basePath</strong>, and we need that location in order to connect to the dataset.
+The Hudi library effectively manages this dataset internally, using the .hoodie subfolder to track all metadata.</p>
+
+<p>To initialize a hudi table, use the following command.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mo">06</span> <span class="mi">15</span><span class="o">:</span><span class="mi">56</span><span class="o">:</span><span class="mi">52</span> <span class="no">INFO</span> <span class="n">annotation</span><span class="o">.</span><span class="na">AutowiredAnnotationBeanPostProcessor</ [...]
+<span class="o">============================================</span>
+<span class="o">*</span>                                          <span class="o">*</span>
+<span class="o">*</span>     <span class="n">_</span>    <span class="n">_</span>           <span class="n">_</span>   <span class="n">_</span>               <span class="o">*</span>
+<span class="o">*</span>    <span class="o">|</span> <span class="o">|</span>  <span class="o">|</span> <span class="o">|</span>         <span class="o">|</span> <span class="o">|</span> <span class="o">(</span><span class="n">_</span><span class="o">)</span>              <span class="o">*</span>
+<span class="o">*</span>    <span class="o">|</span> <span class="o">|</span><span class="n">__</span><span class="o">|</span> <span class="o">|</span>       <span class="n">__</span><span class="o">|</span> <span class="o">|</span>  <span class="o">-</span>               <span class="o">*</span>
+<span class="o">*</span>    <span class="o">|</span>  <span class="n">__</span>  <span class="o">||</span>   <span class="o">|</span> <span class="o">/</span> <span class="n">_</span><span class="err">`</span> <span class="o">|</span> <span class="o">||</span>               <span class="o">*</span>
+<span class="o">*</span>    <span class="o">|</span> <span class="o">|</span>  <span class="o">|</span> <span class="o">||</span>   <span class="o">||</span> <span class="o">(</span><span class="n">_</span><span class="o">|</span> <span class="o">|</span> <span class="o">||</span>               <span class="o">*</span>
+<span class="o">*</span>    <span class="o">|</span><span class="n">_</span><span class="o">|</span>  <span class="o">|</span><span class="n">_</span><span class="o">|</span><span class="err">\</span><span class="n">___</span><span class="o">/</span> <span class="err">\</span><span class="n">____</span><span class="o">/</span> <span class="o">||</span>               <span class="o">*</span>
+<span class="o">*</span>                                          <span class="o">*</span>
+<span class="o">============================================</span>
+
+<span class="nc">Welcome</span> <span class="n">to</span> <span class="nc">Hoodie</span> <span class="no">CLI</span><span class="o">.</span> <span class="nc">Please</span> <span class="n">type</span> <span class="n">help</span> <span class="k">if</span> <span class="n">you</span> <span class="n">are</span> <span class="n">looking</span> <span class="k">for</span> <span class="n">help</span><span class="o">.</span>
+<span class="n">hudi</span><span class="o">-&gt;</span><span class="n">create</span> <span class="o">--</span><span class="n">path</span> <span class="o">/</span><span class="n">user</span><span class="o">/</span><span class="n">hive</span><span class="o">/</span><span class="n">warehouse</span><span class="o">/</span><span class="n">table1</span> <span class="o">--</span><span class="n">tableName</span> <span class="n">hoodie_table_1</span> <span class="o">--</span><span class="n">table [...]
+<span class="o">.....</span>
+<span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mo">06</span> <span class="mi">15</span><span class="o">:</span><span class="mi">57</span><span class="o">:</span><span class="mi">15</span> <span class="no">INFO</span> <span class="n">table</span><span class="o">.</span><span class="na">HoodieTableMetaClient</span><span class="o">:</span> <span class="nc">Finished</span> <span class="nc">Loading</span> <span class="nc">Table [...]
+</code></pre></div></div>
+
+<p>To see the description of a hudi table, use the desc command:</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">hoodie:</span><span class="n">hoodie_table_1</span><span class="o">-&gt;</span><span class="n">desc</span>
+<span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mo">06</span> <span class="mi">15</span><span class="o">:</span><span class="mi">57</span><span class="o">:</span><span class="mi">19</span> <span class="no">INFO</span> <span class="n">timeline</span><span class="o">.</span><span class="na">HoodieActiveTimeline</span><span class="o">:</span> <span class="nc">Loaded</span> <span class="n">instants</span> <span class="o">[]</span>
+    <span class="n">_________________________________________________________</span>
+    <span class="o">|</span> <span class="nc">Property</span>                <span class="o">|</span> <span class="nc">Value</span>                        <span class="o">|</span>
+    <span class="o">|========================================================|</span>
+    <span class="o">|</span> <span class="n">basePath</span>                <span class="o">|</span> <span class="o">...</span>                          <span class="o">|</span>
+    <span class="o">|</span> <span class="n">metaPath</span>                <span class="o">|</span> <span class="o">...</span>                          <span class="o">|</span>
+    <span class="o">|</span> <span class="n">fileSystem</span>              <span class="o">|</span> <span class="n">hdfs</span>                         <span class="o">|</span>
+    <span class="o">|</span> <span class="n">hoodie</span><span class="o">.</span><span class="na">table</span><span class="o">.</span><span class="na">name</span>       <span class="o">|</span> <span class="n">hoodie_table_1</span>               <span class="o">|</span>
+    <span class="o">|</span> <span class="n">hoodie</span><span class="o">.</span><span class="na">table</span><span class="o">.</span><span class="na">type</span>       <span class="o">|</span> <span class="no">COPY_ON_WRITE</span>                <span class="o">|</span>
+    <span class="o">|</span> <span class="n">hoodie</span><span class="o">.</span><span class="na">archivelog</span><span class="o">.</span><span class="na">folder</span><span class="o">|</span>                              <span class="o">|</span>
+</code></pre></div></div>
+
+<p>Following is a sample command to connect to a Hudi dataset containing uber trips.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">hoodie:</span><span class="n">trips</span><span class="o">-&gt;</span><span class="n">connect</span> <span class="o">--</span><span class="n">path</span> <span class="o">/</span><span class="n">app</span><span class="o">/</span><span class="n">uber</span><span class="o">/</span><span class="n">trips</span>
+
+<span class="mi">16</span><span class="o">/</span><span class="mi">10</span><span class="o">/</span><span class="mo">05</span> <span class="mi">23</span><span class="o">:</span><span class="mi">20</span><span class="o">:</span><span class="mi">37</span> <span class="no">INFO</span> <span class="n">model</span><span class="o">.</span><span class="na">HoodieTableMetadata</span><span class="o">:</span> <span class="nc">Attempting</span> <span class="n">to</span> <span class="n">load</span>  [...]
+<span class="mi">16</span><span class="o">/</span><span class="mi">10</span><span class="o">/</span><span class="mo">05</span> <span class="mi">23</span><span class="o">:</span><span class="mi">20</span><span class="o">:</span><span class="mi">37</span> <span class="no">INFO</span> <span class="n">model</span><span class="o">.</span><span class="na">HoodieTableMetadata</span><span class="o">:</span> <span class="nc">Attempting</span> <span class="n">to</span> <span class="n">load</span>  [...]
+<span class="mi">16</span><span class="o">/</span><span class="mi">10</span><span class="o">/</span><span class="mo">05</span> <span class="mi">23</span><span class="o">:</span><span class="mi">20</span><span class="o">:</span><span class="mi">37</span> <span class="no">INFO</span> <span class="n">model</span><span class="o">.</span><span class="na">HoodieTableMetadata</span><span class="o">:</span> <span class="nc">All</span> <span class="n">commits</span> <span class="o">:</span><span  [...]
+<span class="nc">Metadata</span> <span class="k">for</span> <span class="n">table</span> <span class="n">trips</span> <span class="n">loaded</span>
+<span class="nl">hoodie:</span><span class="n">trips</span><span class="o">-&gt;</span>
+</code></pre></div></div>
+
+<p>Once connected to the dataset, many other commands become available. The shell has contextual autocomplete help (press TAB); below is a list of all commands, a few of which are reviewed in detail in this section.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">hoodie:</span><span class="n">trips</span><span class="o">-&gt;</span><span class="n">help</span>
+<span class="o">*</span> <span class="o">!</span> <span class="o">-</span> <span class="nc">Allows</span> <span class="n">execution</span> <span class="n">of</span> <span class="n">operating</span> <span class="nf">system</span> <span class="o">(</span><span class="no">OS</span><span class="o">)</span> <span class="n">commands</span>
+<span class="o">*</span> <span class="c1">// - Inline comment markers (start of line only)</span>
+<span class="o">*</span> <span class="o">;</span> <span class="o">-</span> <span class="nc">Inline</span> <span class="n">comment</span> <span class="nf">markers</span> <span class="o">(</span><span class="n">start</span> <span class="n">of</span> <span class="n">line</span> <span class="n">only</span><span class="o">)</span>
+<span class="o">*</span> <span class="n">addpartitionmeta</span> <span class="o">-</span> <span class="nc">Add</span> <span class="n">partition</span> <span class="n">metadata</span> <span class="n">to</span> <span class="n">a</span> <span class="n">dataset</span><span class="o">,</span> <span class="k">if</span> <span class="n">not</span> <span class="n">present</span>
+<span class="o">*</span> <span class="n">clear</span> <span class="o">-</span> <span class="nc">Clears</span> <span class="n">the</span> <span class="n">console</span>
+<span class="o">*</span> <span class="n">cls</span> <span class="o">-</span> <span class="nc">Clears</span> <span class="n">the</span> <span class="n">console</span>
+<span class="o">*</span> <span class="n">commit</span> <span class="n">rollback</span> <span class="o">-</span> <span class="nc">Rollback</span> <span class="n">a</span> <span class="n">commit</span>
+<span class="o">*</span> <span class="n">commits</span> <span class="n">compare</span> <span class="o">-</span> <span class="nc">Compare</span> <span class="n">commits</span> <span class="n">with</span> <span class="n">another</span> <span class="nc">Hoodie</span> <span class="n">dataset</span>
+<span class="o">*</span> <span class="n">commit</span> <span class="n">showfiles</span> <span class="o">-</span> <span class="nc">Show</span> <span class="n">file</span> <span class="n">level</span> <span class="n">details</span> <span class="n">of</span> <span class="n">a</span> <span class="n">commit</span>
+<span class="o">*</span> <span class="n">commit</span> <span class="n">showpartitions</span> <span class="o">-</span> <span class="nc">Show</span> <span class="n">partition</span> <span class="n">level</span> <span class="n">details</span> <span class="n">of</span> <span class="n">a</span> <span class="n">commit</span>
+<span class="o">*</span> <span class="n">commits</span> <span class="n">refresh</span> <span class="o">-</span> <span class="nc">Refresh</span> <span class="n">the</span> <span class="n">commits</span>
+<span class="o">*</span> <span class="n">commits</span> <span class="n">show</span> <span class="o">-</span> <span class="nc">Show</span> <span class="n">the</span> <span class="n">commits</span>
+<span class="o">*</span> <span class="n">commits</span> <span class="n">sync</span> <span class="o">-</span> <span class="nc">Compare</span> <span class="n">commits</span> <span class="n">with</span> <span class="n">another</span> <span class="nc">Hoodie</span> <span class="n">dataset</span>
+<span class="o">*</span> <span class="n">connect</span> <span class="o">-</span> <span class="nc">Connect</span> <span class="n">to</span> <span class="n">a</span> <span class="n">hoodie</span> <span class="n">dataset</span>
+<span class="o">*</span> <span class="n">date</span> <span class="o">-</span> <span class="nc">Displays</span> <span class="n">the</span> <span class="n">local</span> <span class="n">date</span> <span class="n">and</span> <span class="n">time</span>
+<span class="o">*</span> <span class="n">exit</span> <span class="o">-</span> <span class="nc">Exits</span> <span class="n">the</span> <span class="n">shell</span>
+<span class="o">*</span> <span class="n">help</span> <span class="o">-</span> <span class="nc">List</span> <span class="n">all</span> <span class="n">commands</span> <span class="n">usage</span>
+<span class="o">*</span> <span class="n">quit</span> <span class="o">-</span> <span class="nc">Exits</span> <span class="n">the</span> <span class="n">shell</span>
+<span class="o">*</span> <span class="n">records</span> <span class="n">deduplicate</span> <span class="o">-</span> <span class="nc">De</span><span class="o">-</span><span class="n">duplicate</span> <span class="n">a</span> <span class="n">partition</span> <span class="n">path</span> <span class="n">contains</span> <span class="n">duplicates</span> <span class="o">&amp;</span> <span class="n">produce</span> <span class="n">repaired</span> <span class="n">files</span> <span class="n">to</ [...]
+<span class="o">*</span> <span class="n">script</span> <span class="o">-</span> <span class="nc">Parses</span> <span class="n">the</span> <span class="n">specified</span> <span class="n">resource</span> <span class="n">file</span> <span class="n">and</span> <span class="n">executes</span> <span class="n">its</span> <span class="n">commands</span>
+<span class="o">*</span> <span class="n">stats</span> <span class="n">filesizes</span> <span class="o">-</span> <span class="nc">File</span> <span class="nc">Sizes</span><span class="o">.</span> <span class="nc">Display</span> <span class="n">summary</span> <span class="n">stats</span> <span class="n">on</span> <span class="n">sizes</span> <span class="n">of</span> <span class="n">files</span>
+<span class="o">*</span> <span class="n">stats</span> <span class="n">wa</span> <span class="o">-</span> <span class="nc">Write</span> <span class="nc">Amplification</span><span class="o">.</span> <span class="nc">Ratio</span> <span class="n">of</span> <span class="n">how</span> <span class="n">many</span> <span class="n">records</span> <span class="n">were</span> <span class="n">upserted</span> <span class="n">to</span> <span class="n">how</span> <span class="n">many</span> <span class= [...]
+<span class="o">*</span> <span class="n">sync</span> <span class="n">validate</span> <span class="o">-</span> <span class="nc">Validate</span> <span class="n">the</span> <span class="n">sync</span> <span class="n">by</span> <span class="n">counting</span> <span class="n">the</span> <span class="n">number</span> <span class="n">of</span> <span class="n">records</span>
+<span class="o">*</span> <span class="n">system</span> <span class="n">properties</span> <span class="o">-</span> <span class="nc">Shows</span> <span class="n">the</span> <span class="n">shell</span><span class="err">'</span><span class="n">s</span> <span class="n">properties</span>
+<span class="o">*</span> <span class="n">utils</span> <span class="n">loadClass</span> <span class="o">-</span> <span class="nc">Load</span> <span class="n">a</span> <span class="kd">class</span>
+<span class="o">*</span> <span class="nc">version</span> <span class="o">-</span> <span class="nc">Displays</span> <span class="n">shell</span> <span class="n">version</span>
+
+<span class="nl">hoodie:</span><span class="n">trips</span><span class="o">-&gt;</span>
+</code></pre></div></div>
+
+<h3 id="检查提交">Inspecting Commits</h3>
+
+<p>In Hudi, the task of upserting or inserting a batch of records is known as a <strong>commit</strong>. A commit provides basic atomicity guarantees: only committed data is available for querying.
+Each commit has a monotonically increasing string/number known as the <strong>commit number</strong>. Typically, this is the time at which we started the commit.</p>
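+<p>Commit numbers like the ones shown on this page (e.g. 20161005165855) can be parsed back into wall-clock timestamps. A minimal sketch, assuming the yyyyMMddHHmmss layout seen in the examples:</p>

```python
from datetime import datetime

# Assumed layout of a Hudi commit number, based on the examples on this page.
COMMIT_TIME_FORMAT = "%Y%m%d%H%M%S"

def parse_commit_time(commit_time: str) -> datetime:
    """Turn a commit number such as '20161005165855' back into a datetime."""
    return datetime.strptime(commit_time, COMMIT_TIME_FORMAT)

# Lexicographic order of commit numbers matches chronological order,
# which is what makes them usable as a monotonically increasing sequence.
c1, c2 = "20161005165855", "20161006070000"
assert (c1 < c2) == (parse_commit_time(c1) < parse_commit_time(c2))
```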
+
+<p>To view some basic information about the last 10 commits,</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">hoodie:</span><span class="n">trips</span><span class="o">-&gt;</span><span class="n">commits</span> <span class="n">show</span> <span class="o">--</span><span class="n">sortBy</span> <span class="s">"Total Bytes Written"</span> <span class="o">--</span><span class="n">desc</span> <span class="kc">true</span> <span class="o">--</span><span class="n">limit</span> <span class=" [...]
+    <span class="n">________________________________________________________________________________________________________________________________________________________________________</span>
+    <span class="o">|</span> <span class="nc">CommitTime</span>    <span class="o">|</span> <span class="nc">Total</span> <span class="nc">Bytes</span> <span class="nc">Written</span><span class="o">|</span> <span class="nc">Total</span> <span class="nc">Files</span> <span class="nc">Added</span><span class="o">|</span> <span class="nc">Total</span> <span class="nc">Files</span> <span class="nc">Updated</span><span class="o">|</span> <span class="nc">Total</span> <span class="nc">Partiti [...]
+    <span class="o">|=======================================================================================================================================================================|</span>
+    <span class="o">....</span>
+    <span class="o">....</span>
+    <span class="o">....</span>
+<span class="nl">hoodie:</span><span class="n">trips</span><span class="o">-&gt;</span>
+</code></pre></div></div>
+
+<p>At the start of every write, Hudi also writes a .inflight commit to the .hoodie folder. You can use the timestamp there to estimate how long an in-progress commit has already been running.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err">$</span> <span class="n">hdfs</span> <span class="n">dfs</span> <span class="o">-</span><span class="n">ls</span> <span class="o">/</span><span class="n">app</span><span class="o">/</span><span class="n">uber</span><span class="o">/</span><span class="n">trips</span><span class="o">/.</span><span class="na">hoodie</span><span class="o">/*.</span><span class="na">inflight</span>
+<span class="o">-</span><span class="n">rw</span><span class="o">-</span><span class="n">r</span><span class="o">--</span><span class="n">r</span><span class="o">--</span>   <span class="mi">3</span> <span class="n">vinoth</span> <span class="n">supergroup</span>     <span class="mi">321984</span> <span class="mi">2016</span><span class="o">-</span><span class="mi">10</span><span class="o">-</span><span class="mo">05</span> <span class="mi">23</span><span class="o">:</span><span class="m [...]
+</code></pre></div></div>
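+<p>That elapsed-time estimate can be scripted. A small sketch, assuming the inflight file is named &lt;commitTime&gt;.inflight with the commit time in yyyyMMddHHmmss form:</p>

```python
from datetime import datetime

def inflight_elapsed_seconds(inflight_name: str, now: datetime) -> float:
    """Estimate how long a commit has been in flight from its .inflight file name.

    Assumes the file is named <commitTime>.inflight, with the commit time
    in yyyyMMddHHmmss form, as in the examples on this page.
    """
    started = datetime.strptime(inflight_name.split(".")[0], "%Y%m%d%H%M%S")
    return (now - started).total_seconds()

# A commit that started at 23:05:55 has been in flight for 5 minutes at 23:10:55.
elapsed = inflight_elapsed_seconds("20161005230555.inflight", datetime(2016, 10, 5, 23, 10, 55))
assert elapsed == 300.0
```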
+
+<h3 id="深入到特定的提交">Drilling Down to a Specific Commit</h3>
+
+<p>To understand how the writes were spread across specific partitions,</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">hoodie:</span><span class="n">trips</span><span class="o">-&gt;</span><span class="n">commit</span> <span class="n">showpartitions</span> <span class="o">--</span><span class="n">commit</span> <span class="mi">20161005165855</span> <span class="o">--</span><span class="n">sortBy</span> <span class="s">"Total Bytes Written"</span> <span class="o">--</span><span class="n">desc< [...]
+    <span class="n">__________________________________________________________________________________________________________________________________________</span>
+    <span class="o">|</span> <span class="nc">Partition</span> <span class="nc">Path</span><span class="o">|</span> <span class="nc">Total</span> <span class="nc">Files</span> <span class="nc">Added</span><span class="o">|</span> <span class="nc">Total</span> <span class="nc">Files</span> <span class="nc">Updated</span><span class="o">|</span> <span class="nc">Total</span> <span class="nc">Records</span> <span class="nc">Inserted</span><span class="o">|</span> <span class="nc">Total</spa [...]
+    <span class="o">|=========================================================================================================================================|</span>
+     <span class="o">....</span>
+     <span class="o">....</span>
+</code></pre></div></div>
+
+<p>If you need file-level granularity, you can do the following:</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">hoodie:</span><span class="n">trips</span><span class="o">-&gt;</span><span class="n">commit</span> <span class="n">showfiles</span> <span class="o">--</span><span class="n">commit</span> <span class="mi">20161005165855</span> <span class="o">--</span><span class="n">sortBy</span> <span class="s">"Partition Path"</span>
+    <span class="n">________________________________________________________________________________________________________________________________________________________</span>
+    <span class="o">|</span> <span class="nc">Partition</span> <span class="nc">Path</span><span class="o">|</span> <span class="nc">File</span> <span class="no">ID</span>                             <span class="o">|</span> <span class="nc">Previous</span> <span class="nc">Commit</span><span class="o">|</span> <span class="nc">Total</span> <span class="nc">Records</span> <span class="nc">Updated</span><span class="o">|</span> <span class="nc">Total</span> <span class="nc">Records</span> [...]
+    <span class="o">|=======================================================================================================================================================|</span>
+    <span class="o">....</span>
+    <span class="o">....</span>
+</code></pre></div></div>
+
+<h3 id="文件系统视图">File System View</h3>
+
+<p>Hudi views each partition as a collection of file groups, with each file group containing a list of file slices in commit order (see Concepts). The following commands let you view the file slices of a dataset.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="nl">hoodie:</span><span class="n">stock_ticks_mor</span><span class="o">-&gt;</span><span class="n">show</span> <span class="n">fsview</span> <span class="n">all</span>
+ <span class="o">....</span>
+  <span class="n">_______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________</span>
+ <span class="o">|</span> <span class="nc">Partition</span> <span class="o">|</span> <span class="nc">FileId</span> <span class="o">|</span> <span class="nc">Base</span><span class="o">-</span><span class="nc">Instant</span> <span class="o">|</span> <span class="nc">Data</span><span class="o">-</span><span class="nc">File</span> <span class="o">|</span> <span class="nc">Data</span><span class="o">-</span><span class="nc">File</span> <span class="nc">Size</span><span class="o">|</span> <s [...]
+ <span class="o">|==============================================================================================================================================================================================================================================================================================================================================================================================================|</span>
+ <span class="o">|</span> <span class="mi">2018</span><span class="o">/</span><span class="mi">08</span><span class="o">/</span><span class="mi">31</span><span class="o">|</span> <span class="mi">111415</span><span class="n">c3</span><span class="o">-</span><span class="n">f26d</span><span class="o">-</span><span class="mi">4639</span><span class="o">-</span><span class="mi">86</span><span class="n">c8</span><span class="o">-</span><span class="n">f9956f245ac3</span><span class="o">|</sp [...]
+
+
+
+ <span class="nl">hoodie:</span><span class="n">stock_ticks_mor</span><span class="o">-&gt;</span><span class="n">show</span> <span class="n">fsview</span> <span class="n">latest</span> <span class="o">--</span><span class="n">partitionPath</span> <span class="s">"2018/08/31"</span>
+ <span class="o">......</span>
+ <span class="n">___________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________ [...]
+ <span class="o">|</span> <span class="nc">Partition</span> <span class="o">|</span> <span class="nc">FileId</span> <span class="o">|</span> <span class="nc">Base</span><span class="o">-</span><span class="nc">Instant</span> <span class="o">|</span> <span class="nc">Data</span><span class="o">-</span><span class="nc">File</span> <span class="o">|</span> <span class="nc">Data</span><span class="o">-</span><span class="nc">File</span> <span class="nc">Size</span><span class="o">|</span> <s [...]
+ <span class="o">|========================================================================================================================================================================================================================================================================================================================================================================================================================================================================================== [...]
+ <span class="o">|</span> <span class="mi">2018</span><span class="o">/</span><span class="mi">08</span><span class="o">/</span><span class="mi">31</span><span class="o">|</span> <span class="mi">111415</span><span class="n">c3</span><span class="o">-</span><span class="n">f26d</span><span class="o">-</span><span class="mi">4639</span><span class="o">-</span><span class="mi">86</span><span class="n">c8</span><span class="o">-</span><span class="n">f9956f245ac3</span><span class="o">|</sp [...]
+
+ <span class="nl">hoodie:</span><span class="n">stock_ticks_mor</span><span class="o">-&gt;</span>
+</code></pre></div></div>
+
+<h3 id="统计信息">Statistics</h3>
+
+<p>Since Hudi manages the file sizes of DFS datasets directly, this information helps you get an overall picture of Hudi's health.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">hoodie:</span><span class="n">trips</span><span class="o">-&gt;</span><span class="n">stats</span> <span class="n">filesizes</span> <span class="o">--</span><span class="n">partitionPath</span> <span class="mi">2016</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mo">01</span> <span class="o">--</span><span class="n">sortBy</span>  [...]
+    <span class="n">________________________________________________________________________________________________</span>
+    <span class="o">|</span> <span class="nc">CommitTime</span>    <span class="o">|</span> <span class="nc">Min</span>     <span class="o">|</span> <span class="mi">10</span><span class="n">th</span>    <span class="o">|</span> <span class="mi">50</span><span class="n">th</span>    <span class="o">|</span> <span class="n">avg</span>     <span class="o">|</span> <span class="mi">95</span><span class="n">th</span>    <span class="o">|</span> <span class="nc">Max</span>     <span class="o" [...]
+    <span class="o">|===============================================================================================|</span>
+    <span class="o">|</span> <span class="o">&lt;</span><span class="no">COMMIT_ID</span><span class="o">&gt;</span>   <span class="o">|</span> <span class="mf">93.9</span> <span class="no">MB</span> <span class="o">|</span> <span class="mf">93.9</span> <span class="no">MB</span> <span class="o">|</span> <span class="mf">93.9</span> <span class="no">MB</span> <span class="o">|</span> <span class="mf">93.9</span> <span class="no">MB</span> <span class="o">|</span> <span class="mf">93.9</s [...]
+    <span class="o">....</span>
+    <span class="o">....</span>
+</code></pre></div></div>
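+<p>The percentile columns in the <code class="highlighter-rouge">stats filesizes</code> output can also be reproduced offline from a list of file sizes. A rough sketch using nearest-rank percentiles (the CLI's exact method may differ):</p>

```python
from statistics import mean

def percentile(sizes, p):
    """Nearest-rank percentile over a list of file sizes (a rough stand-in
    for the CLI's 10th/50th/95th columns; the exact method may differ)."""
    s = sorted(sizes)
    k = round(p / 100 * (len(s) - 1))
    return s[k]

def filesize_stats(sizes):
    """Summarize file sizes the way `stats filesizes` lays out its columns."""
    return {
        "Min": min(sizes),
        "10th": percentile(sizes, 10),
        "50th": percentile(sizes, 50),
        "avg": mean(sizes),
        "95th": percentile(sizes, 95),
        "Max": max(sizes),
    }
```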
+
+<p>If Hudi writes are taking longer than expected, anomalies can be spotted by watching the write amplification metrics.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">hoodie:</span><span class="n">trips</span><span class="o">-&gt;</span><span class="n">stats</span> <span class="n">wa</span>
+    <span class="n">__________________________________________________________________________</span>
+    <span class="o">|</span> <span class="nc">CommitTime</span>    <span class="o">|</span> <span class="nc">Total</span> <span class="nc">Upserted</span><span class="o">|</span> <span class="nc">Total</span> <span class="nc">Written</span><span class="o">|</span> <span class="nc">Write</span> <span class="nc">Amplification</span> <span class="nc">Factor</span><span class="o">|</span>
+    <span class="o">|=========================================================================|</span>
+    <span class="o">....</span>
+    <span class="o">....</span>
+</code></pre></div></div>
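+<p>The write amplification factor itself is simple arithmetic. A sketch, assuming the factor is defined as total records written divided by total records upserted (consistent with the column order in the output above):</p>

```python
def write_amplification_factor(total_upserted: int, total_written: int) -> float:
    """Write amplification: records actually (re)written per record upserted.

    Assumed definition: Total Written / Total Upserted, matching the column
    order in the `stats wa` output.
    """
    return total_written / total_upserted

# Upserting 1,000 records that force 50,000 records to be rewritten
# gives a write amplification factor of 50.
assert write_amplification_factor(1_000, 50_000) == 50.0
```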
+
+<h3 id="归档的提交">Archived Commits</h3>
+
+<p>To limit the growth of .commit files on DFS, Hudi archives older .commit files (with due respect to the cleaning policy) into a commits.archived file.
+This is a sequence file containing a mapping of commitNumber =&gt; json, with raw information about the commit (the same information nicely summarized above).</p>
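+<p>Reading commits.archived requires a Hadoop SequenceFile reader, but the shape of each entry is just a commit number mapped to a JSON blob. An illustrative sketch (the metadata field names below are hypothetical):</p>

```python
import json

# Illustrative only: the real commits.archived is a Hadoop SequenceFile and
# needs Hadoop client libraries to read. Each entry maps a commit number to
# raw JSON metadata about the commit; the field names here are hypothetical.
raw_entry = ("20161005165855", '{"totalBytesWritten": 98471514, "totalFilesUpdated": 7}')

commit_number, raw_json = raw_entry
metadata = json.loads(raw_json)
assert metadata["totalFilesUpdated"] == 7
```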
+
+<h3 id="压缩">Compactions</h3>
+
+<p>To get an idea of the lag between compaction and the writers, use the following command to list all pending compactions.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">hoodie:</span><span class="n">trips</span><span class="o">-&gt;</span><span class="n">compactions</span> <span class="n">show</span> <span class="n">all</span>
+     <span class="n">___________________________________________________________________</span>
+    <span class="o">|</span> <span class="nc">Compaction</span> <span class="nc">Instant</span> <span class="nc">Time</span><span class="o">|</span> <span class="nc">State</span>    <span class="o">|</span> <span class="nc">Total</span> <span class="nc">FileIds</span> <span class="n">to</span> <span class="n">be</span> <span class="nc">Compacted</span><span class="o">|</span>
+    <span class="o">|==================================================================|</span>
+    <span class="o">|</span> <span class="o">&lt;</span><span class="no">INSTANT_1</span><span class="o">&gt;</span>            <span class="o">|</span> <span class="no">REQUESTED</span><span class="o">|</span> <span class="mi">35</span>                           <span class="o">|</span>
+    <span class="o">|</span> <span class="o">&lt;</span><span class="no">INSTANT_2</span><span class="o">&gt;</span>            <span class="o">|</span> <span class="no">INFLIGHT</span> <span class="o">|</span> <span class="mi">27</span>                           <span class="o">|</span>
+</code></pre></div></div>
+
+<p>To inspect a specific compaction plan, use</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">hoodie:</span><span class="n">trips</span><span class="o">-&gt;</span><span class="n">compaction</span> <span class="n">show</span> <span class="o">--</span><span class="n">instant</span> <span class="o">&lt;</span><span class="no">INSTANT_1</span><span class="o">&gt;</span>
+    <span class="n">_________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________</span>
+    <span class="o">|</span> <span class="nc">Partition</span> <span class="nc">Path</span><span class="o">|</span> <span class="nc">File</span> <span class="nc">Id</span> <span class="o">|</span> <span class="nc">Base</span> <span class="nc">Instant</span>  <span class="o">|</span> <span class="nc">Data</span> <span class="nc">File</span> <span class="nc">Path</span>                                    <span class="o">|</span> <span class="nc">Total</span> <span class="nc">Delta</span> < [...]
+    <span class="o">|================================================================================================================================================================================================================================================</span>
+    <span class="o">|</span> <span class="mi">2018</span><span class="o">/</span><span class="mo">07</span><span class="o">/</span><span class="mi">17</span>    <span class="o">|</span> <span class="o">&lt;</span><span class="no">UUID</span><span class="o">&gt;</span>  <span class="o">|</span> <span class="o">&lt;</span><span class="no">INSTANT_1</span><span class="o">&gt;</span>   <span class="o">|</span> <span class="nl">viewfs:</span><span class="c1">//ns-default/.../../UUID_&lt;INSTA [...]
+
+</code></pre></div></div>
+
+<p>To manually schedule or run a compaction, use the following commands. These commands launch the compaction using a Spark launcher.
+Note: make sure no other application is concurrently scheduling compaction for this dataset.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">hoodie:</span><span class="n">trips</span><span class="o">-&gt;</span><span class="n">help</span> <span class="n">compaction</span> <span class="n">schedule</span>
+<span class="nl">Keyword:</span>                   <span class="n">compaction</span> <span class="n">schedule</span>
+<span class="nl">Description:</span>               <span class="nc">Schedule</span> <span class="nc">Compaction</span>
+ <span class="nl">Keyword:</span>                  <span class="n">sparkMemory</span>
+   <span class="nl">Help:</span>                   <span class="nc">Spark</span> <span class="n">executor</span> <span class="n">memory</span>
+   <span class="nl">Mandatory:</span>              <span class="kc">false</span>
+   <span class="nc">Default</span> <span class="k">if</span> <span class="nl">specified:</span>   <span class="err">'</span><span class="n">__NULL__</span><span class="err">'</span>
+   <span class="nc">Default</span> <span class="k">if</span> <span class="nl">unspecified:</span> <span class="err">'</span><span class="mi">1</span><span class="no">G</span><span class="err">'</span>
+
+<span class="o">*</span> <span class="n">compaction</span> <span class="n">schedule</span> <span class="o">-</span> <span class="nc">Schedule</span> <span class="nc">Compaction</span>
+</code></pre></div></div>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">hoodie:</span><span class="n">trips</span><span class="o">-&gt;</span><span class="n">help</span> <span class="n">compaction</span> <span class="n">run</span>
+<span class="nl">Keyword:</span>                   <span class="n">compaction</span> <span class="n">run</span>
+<span class="nl">Description:</span>               <span class="nc">Run</span> <span class="nc">Compaction</span> <span class="k">for</span> <span class="n">given</span> <span class="n">instant</span> <span class="n">time</span>
+ <span class="nl">Keyword:</span>                  <span class="n">tableName</span>
+   <span class="nl">Help:</span>                   <span class="nc">Table</span> <span class="n">name</span>
+   <span class="nl">Mandatory:</span>              <span class="kc">true</span>
+   <span class="nc">Default</span> <span class="k">if</span> <span class="nl">specified:</span>   <span class="err">'</span><span class="n">__NULL__</span><span class="err">'</span>
+   <span class="nc">Default</span> <span class="k">if</span> <span class="nl">unspecified:</span> <span class="err">'</span><span class="n">__NULL__</span><span class="err">'</span>
+
+ <span class="nl">Keyword:</span>                  <span class="n">parallelism</span>
+   <span class="nl">Help:</span>                   <span class="nc">Parallelism</span> <span class="k">for</span> <span class="n">hoodie</span> <span class="n">compaction</span>
+   <span class="nl">Mandatory:</span>              <span class="kc">true</span>
+   <span class="nc">Default</span> <span class="k">if</span> <span class="nl">specified:</span>   <span class="err">'</span><span class="n">__NULL__</span><span class="err">'</span>
+   <span class="nc">Default</span> <span class="k">if</span> <span class="nl">unspecified:</span> <span class="err">'</span><span class="n">__NULL__</span><span class="err">'</span>
+
+ <span class="nl">Keyword:</span>                  <span class="n">schemaFilePath</span>
+   <span class="nl">Help:</span>                   <span class="nc">Path</span> <span class="k">for</span> <span class="nc">Avro</span> <span class="n">schema</span> <span class="n">file</span>
+   <span class="nl">Mandatory:</span>              <span class="kc">true</span>
+   <span class="nc">Default</span> <span class="k">if</span> <span class="nl">specified:</span>   <span class="err">'</span><span class="n">__NULL__</span><span class="err">'</span>
+   <span class="nc">Default</span> <span class="k">if</span> <span class="nl">unspecified:</span> <span class="err">'</span><span class="n">__NULL__</span><span class="err">'</span>
+
+ <span class="nl">Keyword:</span>                  <span class="n">sparkMemory</span>
+   <span class="nl">Help:</span>                   <span class="nc">Spark</span> <span class="n">executor</span> <span class="n">memory</span>
+   <span class="nl">Mandatory:</span>              <span class="kc">true</span>
+   <span class="nc">Default</span> <span class="k">if</span> <span class="nl">specified:</span>   <span class="err">'</span><span class="n">__NULL__</span><span class="err">'</span>
+   <span class="nc">Default</span> <span class="k">if</span> <span class="nl">unspecified:</span> <span class="err">'</span><span class="n">__NULL__</span><span class="err">'</span>
+
+ <span class="nl">Keyword:</span>                  <span class="n">retry</span>
+   <span class="nl">Help:</span>                   <span class="nc">Number</span> <span class="n">of</span> <span class="n">retries</span>
+   <span class="nl">Mandatory:</span>              <span class="kc">true</span>
+   <span class="nc">Default</span> <span class="k">if</span> <span class="nl">specified:</span>   <span class="err">'</span><span class="n">__NULL__</span><span class="err">'</span>
+   <span class="nc">Default</span> <span class="k">if</span> <span class="nl">unspecified:</span> <span class="err">'</span><span class="n">__NULL__</span><span class="err">'</span>
+
+ <span class="nl">Keyword:</span>                  <span class="n">compactionInstant</span>
+   <span class="nl">Help:</span>                   <span class="nc">Base</span> <span class="n">path</span> <span class="k">for</span> <span class="n">the</span> <span class="n">target</span> <span class="n">hoodie</span> <span class="n">dataset</span>
+   <span class="nl">Mandatory:</span>              <span class="kc">true</span>
+   <span class="nc">Default</span> <span class="k">if</span> <span class="nl">specified:</span>   <span class="err">'</span><span class="n">__NULL__</span><span class="err">'</span>
+   <span class="nc">Default</span> <span class="k">if</span> <span class="nl">unspecified:</span> <span class="err">'</span><span class="n">__NULL__</span><span class="err">'</span>
+
+<span class="o">*</span> <span class="n">compaction</span> <span class="n">run</span> <span class="o">-</span> <span class="nc">Run</span> <span class="nc">Compaction</span> <span class="k">for</span> <span class="n">given</span> <span class="n">instant</span> <span class="n">time</span>
+</code></pre></div></div>
+
+<h3 id="验证压缩">Validate Compaction</h3>
+
+<p>Validating a compaction plan checks whether all the files necessary for the compaction are present and valid.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">hoodie:</span><span class="n">stock_ticks_mor</span><span class="o">-&gt;</span><span class="n">compaction</span> <span class="n">validate</span> <span class="o">--</span><span class="n">instant</span> <span class="mi">20181005222611</span>
+<span class="o">...</span>
+
+   <span class="no">COMPACTION</span> <span class="no">PLAN</span> <span class="no">VALID</span>
+
+    <span class="n">___________________________________________________________________________________________________________________________________________________________________________________________________________________________</span>
+    <span class="o">|</span> <span class="nc">File</span> <span class="nc">Id</span>                             <span class="o">|</span> <span class="nc">Base</span> <span class="nc">Instant</span> <span class="nc">Time</span><span class="o">|</span> <span class="nc">Base</span> <span class="nc">Data</span> <span class="nc">File</span>                                                                                                                   <span class="o">|</span> <span class="n [...]
+    <span class="o">|==========================================================================================================================================================================================================================|</span>
+    <span class="o">|</span> <span class="mo">05320</span><span class="n">e98</span><span class="o">-</span><span class="mi">9</span><span class="n">a57</span><span class="o">-</span><span class="mi">4</span><span class="n">c38</span><span class="o">-</span><span class="n">b809</span><span class="o">-</span><span class="n">a6beaaeb36bd</span><span class="o">|</span> <span class="mi">20181005222445</span>   <span class="o">|</span> <span class="nl">hdfs:</span><span class="c1">//namenode: [...]
+
+
+
+<span class="nl">hoodie:</span><span class="n">stock_ticks_mor</span><span class="o">-&gt;</span><span class="n">compaction</span> <span class="n">validate</span> <span class="o">--</span><span class="n">instant</span> <span class="mi">20181005222601</span>
+
+   <span class="no">COMPACTION</span> <span class="no">PLAN</span> <span class="no">INVALID</span>
+
+    <span class="n">_______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________</span>
+    <span class="o">|</span> <span class="nc">File</span> <span class="nc">Id</span>                             <span class="o">|</span> <span class="nc">Base</span> <span class="nc">Instant</span> <span class="nc">Time</span><span class="o">|</span> <span class="nc">Base</span> <span class="nc">Data</span> <span class="nc">File</span>                                                                                                                   <span class="o">|</span> <span class="n [...]
+    <span class="o">|=====================================================================================================================================================================================================================================================================================================|</span>
+    <span class="o">|</span> <span class="mo">05320</span><span class="n">e98</span><span class="o">-</span><span class="mi">9</span><span class="n">a57</span><span class="o">-</span><span class="mi">4</span><span class="n">c38</span><span class="o">-</span><span class="n">b809</span><span class="o">-</span><span class="n">a6beaaeb36bd</span><span class="o">|</span> <span class="mi">20181005222445</span>   <span class="o">|</span> <span class="nl">hdfs:</span><span class="c1">//namenode: [...]
+</code></pre></div></div>
+
+<h3 id="注意">Note</h3>
+
+<p>The following commands must be executed while no other writer/ingestion application is running.</p>
+
+<p>Sometimes it becomes necessary to remove a fileId from a compaction plan in order to speed up or unblock a compaction operation.
+Any new log files written to that file after the compaction plan was made are safely renamed so that they are preserved. Hudi provides the following CLI to support this:</p>
+
+<h3 id="取消调度压缩">Unschedule Compaction</h3>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">hoodie:</span><span class="n">trips</span><span class="o">-&gt;</span><span class="n">compaction</span> <span class="n">unscheduleFileId</span> <span class="o">--</span><span class="n">fileId</span> <span class="o">&lt;</span><span class="nc">FileUUID</span><span class="o">&gt;</span>
+<span class="o">....</span>
+<span class="nc">No</span> <span class="nc">File</span> <span class="n">renames</span> <span class="n">needed</span> <span class="n">to</span> <span class="n">unschedule</span> <span class="n">file</span> <span class="n">from</span> <span class="n">pending</span> <span class="n">compaction</span><span class="o">.</span> <span class="nc">Operation</span> <span class="n">successful</span><span class="o">.</span>
+</code></pre></div></div>
+
+<p>In other cases, the entire compaction plan needs to be reverted. The following CLI supports this.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">hoodie:</span><span class="n">trips</span><span class="o">-&gt;</span><span class="n">compaction</span> <span class="n">unschedule</span> <span class="o">--</span><span class="n">compactionInstant</span> <span class="o">&lt;</span><span class="n">compactionInstant</span><span class="o">&gt;</span>
+<span class="o">.....</span>
+<span class="nc">No</span> <span class="nc">File</span> <span class="n">renames</span> <span class="n">needed</span> <span class="n">to</span> <span class="n">unschedule</span> <span class="n">pending</span> <span class="n">compaction</span><span class="o">.</span> <span class="nc">Operation</span> <span class="n">successful</span><span class="o">.</span>
+</code></pre></div></div>
+
+<h3 id="修复压缩">Repair Compaction</h3>
+
+<p>The compaction unscheduling operations above can sometimes fail partially (e.g., DFS becomes temporarily unavailable).
+On partial failure, the compaction operation may be left inconsistent with the state of the file slices.
+When you run <code class="highlighter-rouge">compaction validate</code>, you will notice any invalid compaction operations.
+In such cases, the repair command comes to the rescue: it rearranges the file slices so that no file is lost and the file slices are consistent with the compaction plan.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">hoodie:</span><span class="n">stock_ticks_mor</span><span class="o">-&gt;</span><span class="n">compaction</span> <span class="n">repair</span> <span class="o">--</span><span class="n">instant</span> <span class="mi">20181005222611</span>
+<span class="o">......</span>
+<span class="nc">Compaction</span> <span class="n">successfully</span> <span class="n">repaired</span>
+<span class="o">.....</span>
+</code></pre></div></div>
+
+<h2 id="metrics">Metrics</h2>
+
+<p>Once the Hudi client is configured with the right dataset name and metrics environment, it produces the following Graphite metrics to help debug Hudi datasets:</p>
+
+<ul>
+  <li><strong>Commit Duration</strong> - The time taken to successfully commit a batch of records</li>
+  <li><strong>Rollback Duration</strong> - Similarly, the time taken to undo the partial data left over by a failed commit (happens automatically after every failed write)</li>
+  <li><strong>File Level metrics</strong> - The number of files added, versioned, deleted (cleaned) in each commit</li>
+  <li><strong>Record Level metrics</strong> - Total records inserted/updated per commit</li>
+  <li><strong>Partition Level metrics</strong> - The number of partitions updated (very useful for understanding sudden spikes in commit duration)</li>
+</ul>
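+
+<p>As a concrete illustration, the metrics above can be emitted by setting a few writer-side properties (a minimal sketch, assuming the standard Hudi Graphite metrics configs and a hypothetical host and prefix; see the configurations page for the authoritative list):</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hoodie.metrics.on=true
+hoodie.metrics.graphite.host=graphite.example.com
+hoodie.metrics.graphite.port=4756
+hoodie.metrics.graphite.metric.prefix=stock_ticks
+</code></pre></div></div>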
+
+<p>These metrics can then be plotted on a standard tool like Grafana. Below is a sample commit duration chart.</p>
+
+<figure>
+    <img class="docimage" src="/assets/images/hudi_commit_duration.png" alt="hudi_commit_duration.png" style="max-width: 100%" />
+</figure>
+
+<h2 id="troubleshooting">Troubleshooting</h2>
+
+<p>The sections below are generally helpful in debugging Hudi failures. The following metadata fields are added to every record and can be retrieved via standard Hadoop SQL engines (Hive/Presto/Spark) to more easily diagnose the severity of a problem:</p>
+
+<ul>
+  <li><strong>_hoodie_record_key</strong> - Treated as the primary key within each DFS partition; the basis of all updates/inserts</li>
+  <li><strong>_hoodie_commit_time</strong> - The last commit that touched this record</li>
+  <li><strong>_hoodie_file_name</strong> - The actual file name containing the record (very useful for checking for duplicates)</li>
+  <li><strong>_hoodie_partition_path</strong> - The path from basePath that identifies the partition containing this record</li>
+</ul>
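+
+<p>These fields can be queried directly to trace an individual record, e.g. its last commit and physical file (a sketch, assuming a Hive table named <code class="highlighter-rouge">stock_ticks_cow</code> and a hypothetical record key):</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>select _hoodie_commit_time, _hoodie_partition_path, _hoodie_file_name
+from stock_ticks_cow
+where _hoodie_record_key = 'GOOG_2018-08-31 10';
+</code></pre></div></div>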
+
+<p>Note that as of now, Hudi assumes the application passes the same deterministic partition path for a given recordKey. i.e., the uniqueness of a recordKey (primary key) is only guaranteed within each partition.</p>
+
+<h3 id="缺失记录">Missing records</h3>
+
+<p>Please check for any write errors using the admin commands above, within the time window when the record could have been written.
+If you do find errors, then the record was not actually written by Hudi, but handed back to the application to decide what to do with it.</p>
+
+<h3 id="重复">Duplicates</h3>
+
+<p>First, please ensure that the queries accessing the Hudi dataset are <a href="sql_queries.html">problem-free</a>, and then confirm that duplicates do in fact exist.</p>
+
+<ul>
+  <li>If confirmed, please use the metadata fields above to identify the physical files and partitions containing the records.</li>
+  <li>If the duplicate records exist in files under different partition paths, it means your application is generating different partition paths for the same recordKey; please fix your application.</li>
+  <li>If the duplicate records exist in multiple files within the same partition path, please report this on the mailing list. This should not happen. You can use the <code class="highlighter-rouge">records deduplicate</code> command to fix the data.</li>
+</ul>
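+
+<p>A quick way to confirm duplicates and locate the offending files is to group on the record key using the metadata fields (a sketch, assuming a Hive table named <code class="highlighter-rouge">stock_ticks_cow</code>):</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>select _hoodie_record_key, _hoodie_partition_path,
+       collect_set(_hoodie_file_name), count(*)
+from stock_ticks_cow
+group by _hoodie_record_key, _hoodie_partition_path
+having count(*) &gt; 1;
+</code></pre></div></div>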
+
+<h3 id="spark-ui">Spark failures</h3>
+
+<p>A typical upsert() DAG looks like the one below. Note that the Hudi client caches intermediate RDDs to intelligently profile the workload and size files and Spark parallelism.
+Also, since the probe job is shown as well, the Spark UI shows sortByKey twice, although it is only one sort.</p>
+<figure>
+    <img class="docimage" src="/assets/images/hudi_upsert_dag.png" alt="hudi_upsert_dag.png" style="max-width: 100%" />
+</figure>
+
+<p>At a high level, there are two steps:</p>
+
+<p><strong>Index Lookup to identify the files to be changed</strong></p>
+
+<ul>
+  <li>Job 1 : Triggers the input data read, converts it to HoodieRecord objects, and derives the target partition path from each input record.</li>
+  <li>Job 2 : Loads the set of file names that we need to check against.</li>
+  <li>Job 3 &amp; 4 : Unions the RDDs from 1 and 2 above, intelligently sizing the Spark join parallelism, and then performs the actual lookup.</li>
+  <li>Job 5 : Produces an RDD of recordKeys tagged with their locations.</li>
+</ul>
+
+<p><strong>Performing the actual writing of data</strong></p>
+
+<ul>
+  <li>Job 6 : Lazily joins the records with the tagged recordKey locations to provide the final set of HoodieRecords, which now contain the file/partition path information for each record (or null if it is an insert). The workload is then profiled again to determine file sizing.</li>
+  <li>Job 7 : Performs the actual writing of data (updates + inserts + inserts converted to updates to maintain file sizes)</li>
+</ul>
+
+<p>Depending on the source of the exception (Hudi/Spark), the information above about the DAG can be used to pinpoint the actual issue. The most frequently encountered failures result from transient YARN/DFS failures.
+In the future, more sophisticated debugging/management UIs will be added to the project, to help automate some of this debugging.</p>
+
+      </section>
+
+      <a href="#masthead__inner-wrap" class="back-to-top">Back to top &uarr;</a>
+
+
+      
+
+    </div>
+
+  </article>
+
+</div>
+
+    </div>
+
+    <div class="page__footer">
+      <footer>
+        
+<div class="row">
+  <div class="col-lg-12 footer">
+    <p>
+      <a class="footer-link-img" href="https://apache.org">
+        <img width="250px" src="/assets/images/asf_logo.svg" alt="The Apache Software Foundation">
+      </a>
+    </p>
+    <p>
+      Copyright &copy; <span id="copyright-year">2019</span> <a href="https://apache.org">The Apache Software Foundation</a>, Licensed under the Apache License, Version 2.0.
+      Hudi, Apache and the Apache feather logo are trademarks of The Apache Software Foundation. <a href="/docs/privacy">Privacy Policy</a>
+      <br>
+      Apache Hudi is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the <a href="http://incubator.apache.org/">Apache Incubator</a>.
+      Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have
+      stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a
+      reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
+    </p>
+  </div>
+</div>
+      </footer>
+    </div>
+
+    
+<script src="/assets/js/main.min.js"></script>
+
+
+  </body>
+</html>
\ No newline at end of file
diff --git a/content/docs/docker_demo.html b/content/cn/docs/0.5.1-docker_demo.html
similarity index 85%
copy from content/docs/docker_demo.html
copy to content/cn/docs/0.5.1-docker_demo.html
index 75f9068..d984196 100644
--- a/content/docs/docker_demo.html
+++ b/content/cn/docs/0.5.1-docker_demo.html
@@ -10,7 +10,7 @@
 <meta property="og:locale" content="en_US">
 <meta property="og:site_name" content="">
 <meta property="og:title" content="Docker Demo">
-<meta property="og:url" content="https://hudi.apache.org/docs/docker_demo.html">
+<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.1-docker_demo.html">
 
 
   <meta property="og:description" content="A Demo using docker containers">
@@ -83,15 +83,15 @@
           
         </a>
         <ul class="visible-links"><li class="masthead__menu-item">
-              <a href="/docs/quick-start-guide.html" target="_self" >Documentation</a>
+              <a href="/cn/docs/quick-start-guide.html" target="_self" >文档</a>
             </li><li class="masthead__menu-item">
-              <a href="/community.html" target="_self" >Community</a>
+              <a href="/cn/community.html" target="_self" >社区</a>
             </li><li class="masthead__menu-item">
-              <a href="/activity.html" target="_self" >Activities</a>
+              <a href="/cn/activity.html" target="_self" >动态</a>
             </li><li class="masthead__menu-item">
               <a href="https://cwiki.apache.org/confluence/display/HUDI/FAQ" target="_blank" >FAQ</a>
             </li><li class="masthead__menu-item">
-              <a href="/releases.html" target="_self" >Releases</a>
+              <a href="/cn/releases.html" target="_self" >发布</a>
             </li></ul>
         <button class="greedy-nav__toggle hidden" type="button">
           <span class="visually-hidden">Toggle menu</span>
@@ -129,12 +129,12 @@
 <nav class="nav__list">
   
   <input id="ac-toc" name="accordion-toc" type="checkbox" />
-  <label for="ac-toc">Toggle Menu</label>
+  <label for="ac-toc">文档菜单</label>
   <ul class="nav__items">
     
       <li>
         
-          <span class="nav__sub-title">Getting Started</span>
+          <span class="nav__sub-title">入门指南</span>
         
 
         
@@ -147,7 +147,7 @@
             
 
             
-              <li><a href="/docs/quick-start-guide.html" class="">Quick Start</a></li>
+              <li><a href="/cn/docs/0.5.1-quick-start-guide.html" class="">快速开始</a></li>
             
 
           
@@ -158,7 +158,7 @@
             
 
             
-              <li><a href="/docs/use_cases.html" class="">Use Cases</a></li>
+              <li><a href="/cn/docs/0.5.1-use_cases.html" class="">使用案例</a></li>
             
 
           
@@ -169,7 +169,7 @@
             
 
             
-              <li><a href="/docs/powered_by.html" class="">Talks & Powered By</a></li>
+              <li><a href="/cn/docs/0.5.1-powered_by.html" class="">演讲 & hudi 用户</a></li>
             
 
           
@@ -180,7 +180,7 @@
             
 
             
-              <li><a href="/docs/comparison.html" class="">Comparison</a></li>
+              <li><a href="/cn/docs/0.5.1-comparison.html" class="">对比</a></li>
             
 
           
@@ -191,7 +191,7 @@
             
 
             
-              <li><a href="/docs/docker_demo.html" class="active">Docker Demo</a></li>
+              <li><a href="/cn/docs/0.5.1-docker_demo.html" class="active">Docker 示例</a></li>
             
 
           
@@ -201,7 +201,7 @@
     
       <li>
         
-          <span class="nav__sub-title">Documentation</span>
+          <span class="nav__sub-title">帮助文档</span>
         
 
         
@@ -214,7 +214,7 @@
             
 
             
-              <li><a href="/docs/concepts.html" class="">Concepts</a></li>
+              <li><a href="/cn/docs/0.5.1-concepts.html" class="">概念</a></li>
             
 
           
@@ -225,7 +225,7 @@
             
 
             
-              <li><a href="/docs/writing_data.html" class="">Writing Data</a></li>
+              <li><a href="/cn/docs/0.5.1-writing_data.html" class="">写入数据</a></li>
             
 
           
@@ -236,7 +236,7 @@
             
 
             
-              <li><a href="/docs/querying_data.html" class="">Querying Data</a></li>
+              <li><a href="/cn/docs/0.5.1-querying_data.html" class="">查询数据</a></li>
             
 
           
@@ -247,7 +247,7 @@
             
 
             
-              <li><a href="/docs/configurations.html" class="">Configuration</a></li>
+              <li><a href="/cn/docs/0.5.1-configurations.html" class="">配置</a></li>
             
 
           
@@ -258,7 +258,7 @@
             
 
             
-              <li><a href="/docs/performance.html" class="">Performance</a></li>
+              <li><a href="/cn/docs/0.5.1-performance.html" class="">性能</a></li>
             
 
           
@@ -269,7 +269,7 @@
             
 
             
-              <li><a href="/docs/deployment.html" class="">Deployment</a></li>
+              <li><a href="/cn/docs/0.5.1-deployment.html" class="">管理</a></li>
             
 
           
@@ -279,7 +279,7 @@
     
       <li>
         
-          <span class="nav__sub-title">INFO</span>
+          <span class="nav__sub-title">其他信息</span>
         
 
         
@@ -292,7 +292,7 @@
             
 
             
-              <li><a href="/docs/docs-versions.html" class="">Docs Versions</a></li>
+              <li><a href="/cn/docs/0.5.1-docs-versions.html" class="">文档版本</a></li>
             
 
           
@@ -303,7 +303,7 @@
             
 
             
-              <li><a href="/docs/privacy.html" class="">Privacy Policy</a></li>
+              <li><a href="/cn/docs/0.5.1-privacy.html" class="">版权信息</a></li>
             
 
           
@@ -360,10 +360,10 @@
       <li><a href="#step-6c-run-presto-queries">Step 6(c): Run Presto Queries</a></li>
       <li><a href="#step-7--incremental-query-for-copy-on-write-table">Step 7 : Incremental Query for COPY-ON-WRITE Table</a></li>
       <li><a href="#incremental-query-with-spark-sql">Incremental Query with Spark SQL:</a></li>
-      <li><a href="#step-8-schedule-and-run-compaction-for-merge-on-read-table">Step 8: Schedule and Run Compaction for Merge-On-Read table</a></li>
+      <li><a href="#step-8-schedule-and-run-compaction-for-merge-on-read-dataset">Step 8: Schedule and Run Compaction for Merge-On-Read dataset</a></li>
       <li><a href="#step-9-run-hive-queries-including-incremental-queries">Step 9: Run Hive Queries including incremental queries</a></li>
-      <li><a href="#step-10-read-optimized-and-snapshot-queries-for-mor-with-spark-sql-after-compaction">Step 10: Read Optimized and Snapshot queries for MOR with Spark-SQL after compaction</a></li>
-      <li><a href="#step-11--presto-read-optimized-queries-on-mor-table-after-compaction">Step 11:  Presto Read Optimized queries on MOR table after compaction</a></li>
+      <li><a href="#step-10-read-optimized-and-realtime-views-for-mor-with-spark-sql-after-compaction">Step 10: Read Optimized and Realtime Views for MOR with Spark-SQL after compaction</a></li>
+      <li><a href="#step-11--presto-queries-over-read-optimized-view-on-mor-dataset-after-compaction">Step 11:  Presto queries over Read Optimized View on MOR dataset after compaction</a></li>
     </ul>
   </li>
   <li><a href="#testing-hudi-in-local-docker-environment">Testing Hudi in Local Docker environment</a>
@@ -407,7 +407,7 @@ data infrastructure is brought up in a local docker cluster within your computer
 
 <h3 id="build-hudi">Build Hudi</h3>
 
-<p>The first step is to build hudi. <strong>Note</strong> This step builds hudi on default supported scala version - 2.11.</p>
+<p>The first step is to build hudi</p>
 <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">cd</span> <span class="o">&lt;</span><span class="no">HUDI_WORKSPACE</span><span class="o">&gt;</span>
 <span class="n">mvn</span> <span class="kn">package</span> <span class="o">-</span><span class="nc">DskipTests</span>
 </code></pre></div></div>
@@ -428,10 +428,7 @@ This should pull the docker images from docker hub and setup docker cluster.</p>
 <span class="nc">Stopping</span> <span class="n">historyserver</span>             <span class="o">...</span> <span class="n">done</span>
 <span class="o">.......</span>
 <span class="o">......</span>
-<span class="nc">Creating</span> <span class="n">network</span> <span class="s">"compose_default"</span> <span class="n">with</span> <span class="n">the</span> <span class="k">default</span> <span class="n">driver</span>
-<span class="nc">Creating</span> <span class="n">volume</span> <span class="s">"compose_namenode"</span> <span class="n">with</span> <span class="k">default</span> <span class="n">driver</span>
-<span class="nc">Creating</span> <span class="n">volume</span> <span class="s">"compose_historyserver"</span> <span class="n">with</span> <span class="k">default</span> <span class="n">driver</span>
-<span class="nc">Creating</span> <span class="n">volume</span> <span class="s">"compose_hive-metastore-postgresql"</span> <span class="n">with</span> <span class="k">default</span> <span class="n">driver</span>
+<span class="nc">Creating</span> <span class="n">network</span> <span class="s">"hudi_demo"</span> <span class="n">with</span> <span class="n">the</span> <span class="k">default</span> <span class="n">driver</span>
 <span class="nc">Creating</span> <span class="n">hive</span><span class="o">-</span><span class="n">metastore</span><span class="o">-</span><span class="n">postgresql</span> <span class="o">...</span> <span class="n">done</span>
 <span class="nc">Creating</span> <span class="n">namenode</span>                  <span class="o">...</span> <span class="n">done</span>
 <span class="nc">Creating</span> <span class="n">zookeeper</span>                 <span class="o">...</span> <span class="n">done</span>
@@ -464,12 +461,12 @@ This should pull the docker images from docker hub and setup docker cluster.</p>
 
 <h2 id="demo">Demo</h2>
 
-<p>Stock Tracker data will be used to showcase different Hudi query types and the effects of Compaction.</p>
+<p>Stock Tracker data will be used to showcase both the different Hudi Views and the effects of Compaction.</p>
 
 <p>Take a look at the directory <code class="highlighter-rouge">docker/demo/data</code>. There are 2 batches of stock data - each at 1 minute granularity.
 The first batch contains stock tracker data for some stock symbols during the first hour of trading window
 (9:30 a.m to 10:30 a.m). The second batch contains tracker data for next 30 mins (10:30 - 11 a.m). Hudi will
-be used to ingest these batches to a table which will contain the latest stock tracker data at hour level granularity.
+be used to ingest these batches to a dataset which will contain the latest stock tracker data at hour level granularity.
 The batches are windowed intentionally so that the second batch contains updates to some of the rows in the first batch.</p>
 
 <h3 id="step-1--publish-the-first-batch-to-kafka">Step 1 : Publish the first batch to Kafka</h3>
@@ -520,18 +517,18 @@ The batches are windowed intentionally so that the second batch contains updates
 <h3 id="step-2-incrementally-ingest-data-from-kafka-topic">Step 2: Incrementally ingest data from Kafka topic</h3>
 
 <p>Hudi comes with a tool named DeltaStreamer. This tool can connect to variety of data sources (including Kafka) to
-pull changes and apply to Hudi table using upsert/insert primitives. Here, we will use the tool to download
+pull changes and apply to Hudi dataset using upsert/insert primitives. Here, we will use the tool to download
 json data from kafka topic and ingest to both COW and MOR tables we initialized in the previous step. This tool
-automatically initializes the tables in the file-system if they do not exist yet.</p>
+automatically initializes the datasets in the file-system if they do not exist yet.</p>
 
 <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">docker</span> <span class="n">exec</span> <span class="o">-</span><span class="n">it</span> <span class="n">adhoc</span><span class="o">-</span><span class="mi">2</span> <span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">bash</span>
 
-<span class="err">#</span> <span class="nc">Run</span> <span class="n">the</span> <span class="n">following</span> <span class="n">spark</span><span class="o">-</span><span class="n">submit</span> <span class="n">command</span> <span class="n">to</span> <span class="n">execute</span> <span class="n">the</span> <span class="n">delta</span><span class="o">-</span><span class="n">streamer</span> <span class="n">and</span> <span class="n">ingest</span> <span class="n">to</span> <span class=" [...]
-<span class="n">spark</span><span class="o">-</span><span class="n">submit</span> <span class="o">--</span><span class="kd">class</span> <span class="nc">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</span><span class="na">utilities</span><span class="o">.</span><span class="na">deltastreamer</span><span class="o">.</span><span class="na">HoodieDeltaStreamer</span> <span class="n">$HUDI_UTILITIES_BUND [...]
+<span class="err">#</span> <span class="nc">Run</span> <span class="n">the</span> <span class="n">following</span> <span class="n">spark</span><span class="o">-</span><span class="n">submit</span> <span class="n">command</span> <span class="n">to</span> <span class="n">execute</span> <span class="n">the</span> <span class="n">delta</span><span class="o">-</span><span class="n">streamer</span> <span class="n">and</span> <span class="n">ingest</span> <span class="n">to</span> <span class=" [...]
+<span class="n">spark</span><span class="o">-</span><span class="n">submit</span> <span class="o">--</span><span class="kd">class</span> <span class="nc">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</span><span class="na">utilities</span><span class="o">.</span><span class="na">deltastreamer</span><span class="o">.</span><span class="na">HoodieDeltaStreamer</span> <span class="n">$HUDI_UTILITIES_BUND [...]
 
 
-<span class="err">#</span> <span class="nc">Run</span> <span class="n">the</span> <span class="n">following</span> <span class="n">spark</span><span class="o">-</span><span class="n">submit</span> <span class="n">command</span> <span class="n">to</span> <span class="n">execute</span> <span class="n">the</span> <span class="n">delta</span><span class="o">-</span><span class="n">streamer</span> <span class="n">and</span> <span class="n">ingest</span> <span class="n">to</span> <span class=" [...]
-<span class="n">spark</span><span class="o">-</span><span class="n">submit</span> <span class="o">--</span><span class="kd">class</span> <span class="nc">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</span><span class="na">utilities</span><span class="o">.</span><span class="na">deltastreamer</span><span class="o">.</span><span class="na">HoodieDeltaStreamer</span> <span class="n">$HUDI_UTILITIES_BUND [...]
+<span class="err">#</span> <span class="nc">Run</span> <span class="n">the</span> <span class="n">following</span> <span class="n">spark</span><span class="o">-</span><span class="n">submit</span> <span class="n">command</span> <span class="n">to</span> <span class="n">execute</span> <span class="n">the</span> <span class="n">delta</span><span class="o">-</span><span class="n">streamer</span> <span class="n">and</span> <span class="n">ingest</span> <span class="n">to</span> <span class=" [...]
+<span class="n">spark</span><span class="o">-</span><span class="n">submit</span> <span class="o">--</span><span class="kd">class</span> <span class="nc">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</span><span class="na">utilities</span><span class="o">.</span><span class="na">deltastreamer</span><span class="o">.</span><span class="na">HoodieDeltaStreamer</span> <span class="n">$HUDI_UTILITIES_BUND [...]
 
 
 <span class="err">#</span> <span class="nc">As</span> <span class="n">part</span> <span class="n">of</span> <span class="n">the</span> <span class="nf">setup</span> <span class="o">(</span><span class="nc">Look</span> <span class="n">at</span> <span class="n">setup_demo</span><span class="o">.</span><span class="na">sh</span><span class="o">),</span> <span class="n">the</span> <span class="n">configs</span> <span class="n">needed</span> <span class="k">for</span> <span class="nc">DeltaSt [...]
@@ -540,49 +537,49 @@ automatically initializes the tables in the file-system if they do not exist yet
 <span class="n">exit</span>
 </code></pre></div></div>
 
-<p>You can use HDFS web-browser to look at the tables
+<p>You can use HDFS web-browser to look at the datasets
 <code class="highlighter-rouge">http://namenode:50070/explorer.html#/user/hive/warehouse/stock_ticks_cow</code>.</p>
 
-<p>You can explore the new partition folder created in the table along with a “deltacommit”
+<p>You can explore the new partition folder created in the dataset along with a “deltacommit”
 file under .hoodie which signals a successful commit.</p>
 
-<p>There will be a similar setup when you browse the MOR table
+<p>There will be a similar setup when you browse the MOR dataset
 <code class="highlighter-rouge">http://namenode:50070/explorer.html#/user/hive/warehouse/stock_ticks_mor</code></p>
 
 <h3 id="step-3-sync-with-hive">Step 3: Sync with Hive</h3>
 
-<p>At this step, the tables are available in HDFS. We need to sync with Hive to create new Hive tables and add partitions
-inorder to run Hive queries against those tables.</p>
+<p>At this step, the datasets are available in HDFS. We need to sync with Hive to create new Hive tables and add partitions
+in order to run Hive queries against those datasets.</p>
 
 <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">docker</span> <span class="n">exec</span> <span class="o">-</span><span class="n">it</span> <span class="n">adhoc</span><span class="o">-</span><span class="mi">2</span> <span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">bash</span>
 
-<span class="err">#</span> <span class="nc">THis</span> <span class="n">command</span> <span class="n">takes</span> <span class="n">in</span> <span class="nc">HIveServer</span> <span class="no">URL</span> <span class="n">and</span> <span class="no">COW</span> <span class="nc">Hudi</span> <span class="n">table</span> <span class="n">location</span> <span class="n">in</span> <span class="no">HDFS</span> <span class="n">and</span> <span class="n">sync</span> <span class="n">the</span> <span [...]
+<span class="err">#</span> <span class="nc">THis</span> <span class="n">command</span> <span class="n">takes</span> <span class="n">in</span> <span class="nc">HIveServer</span> <span class="no">URL</span> <span class="n">and</span> <span class="no">COW</span> <span class="nc">Hudi</span> <span class="nc">Dataset</span> <span class="n">location</span> <span class="n">in</span> <span class="no">HDFS</span> <span class="n">and</span> <span class="n">sync</span> <span class="n">the</span> <s [...]
 <span class="o">/</span><span class="kt">var</span><span class="o">/</span><span class="n">hoodie</span><span class="o">/</span><span class="n">ws</span><span class="o">/</span><span class="n">hudi</span><span class="o">-</span><span class="n">hive</span><span class="o">/</span><span class="n">run_sync_tool</span><span class="o">.</span><span class="na">sh</span>  <span class="o">--</span><span class="n">jdbc</span><span class="o">-</span><span class="n">url</span> <span class="nl">jdbc: [...]
 <span class="o">.....</span>
-<span class="mi">2020</span><span class="o">-</span><span class="mo">01</span><span class="o">-</span><span class="mi">25</span> <span class="mi">19</span><span class="o">:</span><span class="mi">51</span><span class="o">:</span><span class="mi">28</span><span class="o">,</span><span class="mi">953</span> <span class="no">INFO</span>  <span class="o">[</span><span class="n">main</span><span class="o">]</span> <span class="n">hive</span><span class="o">.</span><span class="na">HiveSyncToo [...]
+<span class="mi">2018</span><span class="o">-</span><span class="mi">09</span><span class="o">-</span><span class="mi">24</span> <span class="mi">22</span><span class="o">:</span><span class="mi">22</span><span class="o">:</span><span class="mi">45</span><span class="o">,</span><span class="mi">568</span> <span class="no">INFO</span>  <span class="o">[</span><span class="n">main</span><span class="o">]</span> <span class="n">hive</span><span class="o">.</span><span class="na">HiveSyncToo [...]
 <span class="o">.....</span>
 
-<span class="err">#</span> <span class="nc">Now</span> <span class="n">run</span> <span class="n">hive</span><span class="o">-</span><span class="n">sync</span> <span class="k">for</span> <span class="n">the</span> <span class="n">second</span> <span class="n">data</span><span class="o">-</span><span class="n">set</span> <span class="n">in</span> <span class="no">HDFS</span> <span class="n">using</span> <span class="nc">Merge</span><span class="o">-</span><span class="nc">On</span><span  [...]
+<span class="err">#</span> <span class="nc">Now</span> <span class="n">run</span> <span class="n">hive</span><span class="o">-</span><span class="n">sync</span> <span class="k">for</span> <span class="n">the</span> <span class="n">second</span> <span class="n">data</span><span class="o">-</span><span class="n">set</span> <span class="n">in</span> <span class="no">HDFS</span> <span class="n">using</span> <span class="nc">Merge</span><span class="o">-</span><span class="nc">On</span><span  [...]
 <span class="o">/</span><span class="kt">var</span><span class="o">/</span><span class="n">hoodie</span><span class="o">/</span><span class="n">ws</span><span class="o">/</span><span class="n">hudi</span><span class="o">-</span><span class="n">hive</span><span class="o">/</span><span class="n">run_sync_tool</span><span class="o">.</span><span class="na">sh</span>  <span class="o">--</span><span class="n">jdbc</span><span class="o">-</span><span class="n">url</span> <span class="nl">jdbc: [...]
 <span class="o">...</span>
-<span class="mi">2020</span><span class="o">-</span><span class="mo">01</span><span class="o">-</span><span class="mi">25</span> <span class="mi">19</span><span class="o">:</span><span class="mi">51</span><span class="o">:</span><span class="mi">51</span><span class="o">,</span><span class="mo">066</span> <span class="no">INFO</span>  <span class="o">[</span><span class="n">main</span><span class="o">]</span> <span class="n">hive</span><span class="o">.</span><span class="na">HiveSyncToo [...]
+<span class="mi">2018</span><span class="o">-</span><span class="mi">09</span><span class="o">-</span><span class="mi">24</span> <span class="mi">22</span><span class="o">:</span><span class="mi">23</span><span class="o">:</span><span class="mi">09</span><span class="o">,</span><span class="mi">171</span> <span class="no">INFO</span>  <span class="o">[</span><span class="n">main</span><span class="o">]</span> <span class="n">hive</span><span class="o">.</span><span class="na">HiveSyncToo [...]
 <span class="o">...</span>
-<span class="mi">2020</span><span class="o">-</span><span class="mo">01</span><span class="o">-</span><span class="mi">25</span> <span class="mi">19</span><span class="o">:</span><span class="mi">51</span><span class="o">:</span><span class="mi">51</span><span class="o">,</span><span class="mi">569</span> <span class="no">INFO</span>  <span class="o">[</span><span class="n">main</span><span class="o">]</span> <span class="n">hive</span><span class="o">.</span><span class="na">HiveSyncToo [...]
+<span class="mi">2018</span><span class="o">-</span><span class="mi">09</span><span class="o">-</span><span class="mi">24</span> <span class="mi">22</span><span class="o">:</span><span class="mi">23</span><span class="o">:</span><span class="mi">09</span><span class="o">,</span><span class="mi">559</span> <span class="no">INFO</span>  <span class="o">[</span><span class="n">main</span><span class="o">]</span> <span class="n">hive</span><span class="o">.</span><span class="na">HiveSyncToo [...]
 <span class="o">....</span>
 <span class="n">exit</span>
 </code></pre></div></div>
 <p>After executing the above command, you will notice</p>
 
 <ol>
-  <li>A hive table named <code class="highlighter-rouge">stock_ticks_cow</code> created which supports Snapshot and Incremental queries on Copy On Write table.</li>
-  <li>Two new tables <code class="highlighter-rouge">stock_ticks_mor_rt</code> and <code class="highlighter-rouge">stock_ticks_mor_ro</code> created for the Merge On Read table. The former
-supports Snapshot and Incremental queries (providing near-real time data) while the later supports ReadOptimized queries.</li>
+  <li>A hive table named <code class="highlighter-rouge">stock_ticks_cow</code> created which provides Read-Optimized view for the Copy On Write dataset.</li>
+  <li>Two new tables <code class="highlighter-rouge">stock_ticks_mor</code> and <code class="highlighter-rouge">stock_ticks_mor_rt</code> created for the Merge On Read dataset. The former
+provides the ReadOptimized view for the Hudi dataset and the later provides the realtime-view for the dataset.</li>
 </ol>
 
 <h3 id="step-4-a-run-hive-queries">Step 4 (a): Run Hive Queries</h3>
 
-<p>Run a hive query to find the latest timestamp ingested for stock symbol ‘GOOG’. You will notice that both snapshot 
-(for both COW and MOR _rt table) and read-optimized queries (for MOR _ro table) give the same value “10:29 a.m” as Hudi create a
+<p>Run a hive query to find the latest timestamp ingested for stock symbol ‘GOOG’. You will notice that both read-optimized
+(for both COW and MOR dataset) and realtime views (for MOR dataset) give the same value “10:29 a.m” as Hudi creates a
 parquet file for the first batch of data.</p>
 
 <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">docker</span> <span class="n">exec</span> <span class="o">-</span><span class="n">it</span> <span class="n">adhoc</span><span class="o">-</span><span class="mi">2</span> <span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">bash</span>
@@ -593,10 +590,10 @@ parquet file for the first batch of data.</p>
 <span class="o">|</span>      <span class="n">tab_name</span>       <span class="o">|</span>
 <span class="o">+---------------------+--+</span>
 <span class="o">|</span> <span class="n">stock_ticks_cow</span>     <span class="o">|</span>
-<span class="o">|</span> <span class="n">stock_ticks_mor_ro</span>  <span class="o">|</span>
+<span class="o">|</span> <span class="n">stock_ticks_mor</span>     <span class="o">|</span>
 <span class="o">|</span> <span class="n">stock_ticks_mor_rt</span>  <span class="o">|</span>
 <span class="o">+---------------------+--+</span>
-<span class="mi">3</span> <span class="n">rows</span> <span class="nf">selected</span> <span class="o">(</span><span class="mf">1.199</span> <span class="n">seconds</span><span class="o">)</span>
+<span class="mi">2</span> <span class="n">rows</span> <span class="nf">selected</span> <span class="o">(</span><span class="mf">0.801</span> <span class="n">seconds</span><span class="o">)</span>
 <span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt;</span>
 
 
@@ -635,11 +632,11 @@ parquet file for the first batch of data.</p>
 <span class="err">#</span> <span class="nc">Merge</span><span class="o">-</span><span class="nc">On</span><span class="o">-</span><span class="nc">Read</span> <span class="nl">Queries:</span>
 <span class="o">==========================</span>
 
-<span class="nc">Lets</span> <span class="n">run</span> <span class="n">similar</span> <span class="n">queries</span> <span class="n">against</span> <span class="no">M</span><span class="o">-</span><span class="no">O</span><span class="o">-</span><span class="no">R</span> <span class="n">table</span><span class="o">.</span> <span class="nc">Lets</span> <span class="n">look</span> <span class="n">at</span> <span class="n">both</span> 
-<span class="nc">ReadOptimized</span> <span class="n">and</span> <span class="nf">Snapshot</span><span class="o">(</span><span class="n">realtime</span> <span class="n">data</span><span class="o">)</span> <span class="n">queries</span> <span class="n">supported</span> <span class="n">by</span> <span class="no">M</span><span class="o">-</span><span class="no">O</span><span class="o">-</span><span class="no">R</span> <span class="n">table</span>
+<span class="nc">Lets</span> <span class="n">run</span> <span class="n">similar</span> <span class="n">queries</span> <span class="n">against</span> <span class="no">M</span><span class="o">-</span><span class="no">O</span><span class="o">-</span><span class="no">R</span> <span class="n">dataset</span><span class="o">.</span> <span class="nc">Lets</span> <span class="n">look</span> <span class="n">at</span> <span class="n">both</span>
+<span class="nc">ReadOptimized</span> <span class="n">and</span> <span class="nc">Realtime</span> <span class="n">views</span> <span class="n">supported</span> <span class="n">by</span> <span class="no">M</span><span class="o">-</span><span class="no">O</span><span class="o">-</span><span class="no">R</span> <span class="n">dataset</span>
 
-<span class="err">#</span> <span class="nc">Run</span> <span class="nc">ReadOptimized</span> <span class="nc">Query</span><span class="o">.</span> <span class="nc">Notice</span> <span class="n">that</span> <span class="n">the</span> <span class="n">latest</span> <span class="n">timestamp</span> <span class="n">is</span> <span class="mi">10</span><span class="o">:</span><span class="mi">29</span>
-<span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt; select symbol, max(ts) from stock_ticks_mor_ro group by symbol HAVING symbol = 'GOOG';</span>
+<span class="err">#</span> <span class="nc">Run</span> <span class="n">against</span> <span class="nc">ReadOptimized</span> <span class="nc">View</span><span class="o">.</span> <span class="nc">Notice</span> <span class="n">that</span> <span class="n">the</span> <span class="n">latest</span> <span class="n">timestamp</span> <span class="n">is</span> <span class="mi">10</span><span class="o">:</span><span class="mi">29</span>
+<span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt; select symbol, max(ts) from stock_ticks_mor group by symbol HAVING symbol = 'GOOG';</span>
 <span class="nl">WARNING:</span> <span class="nc">Hive</span><span class="o">-</span><span class="n">on</span><span class="o">-</span><span class="no">MR</span> <span class="n">is</span> <span class="n">deprecated</span> <span class="n">in</span> <span class="nc">Hive</span> <span class="mi">2</span> <span class="n">and</span> <span class="n">may</span> <span class="n">not</span> <span class="n">be</span> <span class="n">available</span> <span class="n">in</span> <span class="n">the</spa [...]
 <span class="o">+---------+----------------------+--+</span>
 <span class="o">|</span> <span class="n">symbol</span>  <span class="o">|</span>         <span class="n">_c1</span>          <span class="o">|</span>
@@ -649,7 +646,7 @@ parquet file for the first batch of data.</p>
 <span class="mi">1</span> <span class="n">row</span> <span class="nf">selected</span> <span class="o">(</span><span class="mf">6.326</span> <span class="n">seconds</span><span class="o">)</span>
 
 
-<span class="err">#</span> <span class="nc">Run</span> <span class="nc">Snapshot</span> <span class="nc">Query</span><span class="o">.</span> <span class="nc">Notice</span> <span class="n">that</span> <span class="n">the</span> <span class="n">latest</span> <span class="n">timestamp</span> <span class="n">is</span> <span class="n">again</span> <span class="mi">10</span><span class="o">:</span><span class="mi">29</span>
+<span class="err">#</span> <span class="nc">Run</span> <span class="n">against</span> <span class="nc">Realtime</span> <span class="nc">View</span><span class="o">.</span> <span class="nc">Notice</span> <span class="n">that</span> <span class="n">the</span> <span class="n">latest</span> <span class="n">timestamp</span> <span class="n">is</span> <span class="n">again</span> <span class="mi">10</span><span class="o">:</span><span class="mi">29</span>
 
 <span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt; select symbol, max(ts) from stock_ticks_mor_rt group by symbol HAVING symbol = 'GOOG';</span>
 <span class="nl">WARNING:</span> <span class="nc">Hive</span><span class="o">-</span><span class="n">on</span><span class="o">-</span><span class="no">MR</span> <span class="n">is</span> <span class="n">deprecated</span> <span class="n">in</span> <span class="nc">Hive</span> <span class="mi">2</span> <span class="n">and</span> <span class="n">may</span> <span class="n">not</span> <span class="n">be</span> <span class="n">available</span> <span class="n">in</span> <span class="n">the</spa [...]
@@ -661,9 +658,9 @@ parquet file for the first batch of data.</p>
 <span class="mi">1</span> <span class="n">row</span> <span class="nf">selected</span> <span class="o">(</span><span class="mf">1.606</span> <span class="n">seconds</span><span class="o">)</span>
 
 
-<span class="err">#</span> <span class="nc">Run</span> <span class="nc">Read</span> <span class="nc">Optimized</span> <span class="n">and</span> <span class="nc">Snapshot</span> <span class="n">project</span> <span class="n">queries</span>
+<span class="err">#</span> <span class="nc">Run</span> <span class="n">projection</span> <span class="n">query</span> <span class="n">against</span> <span class="nc">Read</span> <span class="nc">Optimized</span> <span class="n">and</span> <span class="nc">Realtime</span> <span class="n">tables</span>
 
-<span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt; select `_hoodie_commit_time`, symbol, ts, volume, open, close  from stock_ticks_mor_ro where  symbol = 'GOOG';</span>
+<span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt; select `_hoodie_commit_time`, symbol, ts, volume, open, close  from stock_ticks_mor where  symbol = 'GOOG';</span>
 <span class="o">+----------------------+---------+----------------------+---------+------------+-----------+--+</span>
 <span class="o">|</span> <span class="n">_hoodie_commit_time</span>  <span class="o">|</span> <span class="n">symbol</span>  <span class="o">|</span>          <span class="n">ts</span>          <span class="o">|</span> <span class="n">volume</span>  <span class="o">|</span>    <span class="n">open</span>    <span class="o">|</span>   <span class="n">close</span>   <span class="o">|</span>
 <span class="o">+----------------------+---------+----------------------+---------+------------+-----------+--+</span>
@@ -688,17 +685,17 @@ parquet file for the first batch of data.</p>
 running in spark-sql</p>
 
 <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">docker</span> <span class="n">exec</span> <span class="o">-</span><span class="n">it</span> <span class="n">adhoc</span><span class="o">-</span><span class="mi">1</span> <span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">bash</span>
-<span class="n">$SPARK_INSTALL</span><span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">spark</span><span class="o">-</span><span class="n">shell</span> <span class="o">--</span><span class="n">jars</span> <span class="n">$HUDI_SPARK_BUNDLE</span> <span class="o">--</span><span class="n">master</span> <span class="n">local</span><span class="o">[</span><span class="mi">2</span><span class="o">]</span> <span class="o">--</span><span class="n">driver< [...]
+<span class="n">$SPARK_INSTALL</span><span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">spark</span><span class="o">-</span><span class="n">shell</span> <span class="o">--</span><span class="n">jars</span> <span class="n">$HUDI_SPARK_BUNDLE</span> <span class="o">--</span><span class="n">master</span> <span class="n">local</span><span class="o">[</span><span class="mi">2</span><span class="o">]</span> <span class="o">--</span><span class="n">driver< [...]
 <span class="o">...</span>
 
 <span class="nc">Welcome</span> <span class="n">to</span>
       <span class="n">____</span>              <span class="n">__</span>
      <span class="o">/</span> <span class="n">__</span><span class="o">/</span><span class="n">__</span>  <span class="n">___</span> <span class="n">_____</span><span class="o">/</span> <span class="o">/</span><span class="n">__</span>
     <span class="n">_</span><span class="err">\</span> <span class="err">\</span><span class="o">/</span> <span class="n">_</span> <span class="err">\</span><span class="o">/</span> <span class="n">_</span> <span class="err">`</span><span class="o">/</span> <span class="n">__</span><span class="o">/</span>  <span class="err">'</span><span class="n">_</span><span class="o">/</span>
-   <span class="o">/</span><span class="n">___</span><span class="o">/</span> <span class="o">.</span><span class="na">__</span><span class="o">/</span><span class="err">\</span><span class="n">_</span><span class="o">,</span><span class="n">_</span><span class="o">/</span><span class="n">_</span><span class="o">/</span> <span class="o">/</span><span class="n">_</span><span class="o">/</span><span class="err">\</span><span class="n">_</span><span class="err">\</span>   <span class="n">ve [...]
+   <span class="o">/</span><span class="n">___</span><span class="o">/</span> <span class="o">.</span><span class="na">__</span><span class="o">/</span><span class="err">\</span><span class="n">_</span><span class="o">,</span><span class="n">_</span><span class="o">/</span><span class="n">_</span><span class="o">/</span> <span class="o">/</span><span class="n">_</span><span class="o">/</span><span class="err">\</span><span class="n">_</span><span class="err">\</span>   <span class="n">ve [...]
       <span class="o">/</span><span class="n">_</span><span class="o">/</span>
 
-<span class="nc">Using</span> <span class="nc">Scala</span> <span class="n">version</span> <span class="mf">2.11</span><span class="o">.</span><span class="mi">12</span> <span class="o">(</span><span class="nc">OpenJDK</span> <span class="mi">64</span><span class="o">-</span><span class="nc">Bit</span> <span class="nc">Server</span> <span class="no">VM</span><span class="o">,</span> <span class="nc">Java</span> <span class="mf">1.8</span><span class="o">.</span><span class="mi">0_212</sp [...]
+<span class="nc">Using</span> <span class="nc">Scala</span> <span class="n">version</span> <span class="mf">2.11</span><span class="o">.</span><span class="mi">8</span> <span class="o">(</span><span class="nc">Java</span> <span class="nf">HotSpot</span><span class="o">(</span><span class="no">TM</span><span class="o">)</span> <span class="mi">64</span><span class="o">-</span><span class="nc">Bit</span> <span class="nc">Server</span> <span class="no">VM</span><span class="o">,</span> <spa [...]
 <span class="nc">Type</span> <span class="n">in</span> <span class="n">expressions</span> <span class="n">to</span> <span class="n">have</span> <span class="n">them</span> <span class="n">evaluated</span><span class="o">.</span>
 <span class="nc">Type</span> <span class="o">:</span><span class="n">help</span> <span class="k">for</span> <span class="n">more</span> <span class="n">information</span><span class="o">.</span>
 
@@ -708,7 +705,7 @@ running in spark-sql</p>
 <span class="o">|</span><span class="n">database</span><span class="o">|</span><span class="n">tableName</span>         <span class="o">|</span><span class="n">isTemporary</span><span class="o">|</span>
 <span class="o">+--------+------------------+-----------+</span>
 <span class="o">|</span><span class="k">default</span> <span class="o">|</span><span class="n">stock_ticks_cow</span>   <span class="o">|</span><span class="kc">false</span>      <span class="o">|</span>
-<span class="o">|</span><span class="k">default</span> <span class="o">|</span><span class="n">stock_ticks_mor_ro</span><span class="o">|</span><span class="kc">false</span>      <span class="o">|</span>
+<span class="o">|</span><span class="k">default</span> <span class="o">|</span><span class="n">stock_ticks_mor</span>   <span class="o">|</span><span class="kc">false</span>      <span class="o">|</span>
 <span class="o">|</span><span class="k">default</span> <span class="o">|</span><span class="n">stock_ticks_mor_rt</span><span class="o">|</span><span class="kc">false</span>      <span class="o">|</span>
 <span class="o">+--------+------------------+-----------+</span>
 
@@ -739,11 +736,11 @@ scala&gt; spark.sql("</span><span class="n">select</span> <span class="err">`</s
 # Merge-On-Read Queries:
 ==========================
 
-Lets run similar queries against M-O-R table. Lets look at both
-ReadOptimized and Snapshot queries supported by M-O-R table
+Lets run similar queries against M-O-R dataset. Lets look at both
+ReadOptimized and Realtime views supported by M-O-R dataset
 
-# Run ReadOptimized Query. Notice that the latest timestamp is 10:29
-scala&gt; spark.sql("</span><span class="n">select</span> <span class="n">symbol</span><span class="o">,</span> <span class="n">max</span><span class="o">(</span><span class="n">ts</span><span class="o">)</span> <span class="n">from</span> <span class="n">stock_ticks_mor_ro</span> <span class="n">group</span> <span class="n">by</span> <span class="n">symbol</span> <span class="no">HAVING</span> <span class="n">symbol</span> <span class="o">=</span> <span class="err">'</span><span class=" [...]
+# Run against ReadOptimized View. Notice that the latest timestamp is 10:29
+scala&gt; spark.sql("</span><span class="n">select</span> <span class="n">symbol</span><span class="o">,</span> <span class="n">max</span><span class="o">(</span><span class="n">ts</span><span class="o">)</span> <span class="n">from</span> <span class="n">stock_ticks_mor</span> <span class="n">group</span> <span class="n">by</span> <span class="n">symbol</span> <span class="no">HAVING</span> <span class="n">symbol</span> <span class="o">=</span> <span class="err">'</span><span class="no" [...]
 +------+-------------------+
 |symbol|max(ts)            |
 +------+-------------------+
@@ -751,7 +748,7 @@ scala&gt; spark.sql("</span><span class="n">select</span> <span class="n">symbol
 +------+-------------------+
 
 
-# Run Snapshot Query. Notice that the latest timestamp is again 10:29
+# Run against Realtime View. Notice that the latest timestamp is again 10:29
 
 scala&gt; spark.sql("</span><span class="n">select</span> <span class="n">symbol</span><span class="o">,</span> <span class="n">max</span><span class="o">(</span><span class="n">ts</span><span class="o">)</span> <span class="n">from</span> <span class="n">stock_ticks_mor_rt</span> <span class="n">group</span> <span class="n">by</span> <span class="n">symbol</span> <span class="no">HAVING</span> <span class="n">symbol</span> <span class="o">=</span> <span class="err">'</span><span class=" [...]
 +------+-------------------+
@@ -760,9 +757,9 @@ scala&gt; spark.sql("</span><span class="n">select</span> <span class="n">symbol
 |GOOG  |2018-08-31 10:29:00|
 +------+-------------------+
 
-# Run Read Optimized and Snapshot project queries
+# Run projection query against Read Optimized and Realtime tables
 
-scala&gt; spark.sql("</span><span class="n">select</span> <span class="err">`</span><span class="n">_hoodie_commit_time</span><span class="err">`</span><span class="o">,</span> <span class="n">symbol</span><span class="o">,</span> <span class="n">ts</span><span class="o">,</span> <span class="n">volume</span><span class="o">,</span> <span class="n">open</span><span class="o">,</span> <span class="n">close</span>  <span class="n">from</span> <span class="n">stock_ticks_mor_ro</span> <span [...]
+scala&gt; spark.sql("</span><span class="n">select</span> <span class="err">`</span><span class="n">_hoodie_commit_time</span><span class="err">`</span><span class="o">,</span> <span class="n">symbol</span><span class="o">,</span> <span class="n">ts</span><span class="o">,</span> <span class="n">volume</span><span class="o">,</span> <span class="n">open</span><span class="o">,</span> <span class="n">close</span>  <span class="n">from</span> <span class="n">stock_ticks_mor</span> <span cl [...]
 +-------------------+------+-------------------+------+---------+--------+
 |_hoodie_commit_time|symbol|ts                 |volume|open     |close   |
 +-------------------+------+-------------------+------+---------+--------+
@@ -782,7 +779,7 @@ scala&gt; spark.sql("</span><span class="n">select</span> <span class="err">`</s
 
 <h3 id="step-4-c-run-presto-queries">Step 4 (c): Run Presto Queries</h3>
 
-<p>Here are the Presto queries for similar Hive and Spark queries. Currently, Presto does not support snapshot or incremental queries on Hudi tables.</p>
+<p>Here are Presto queries similar to the Hive and Spark queries above. Currently, Hudi does not support Presto queries on realtime views.</p>
 
 <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">docker</span> <span class="n">exec</span> <span class="o">-</span><span class="n">it</span> <span class="n">presto</span><span class="o">-</span><span class="n">worker</span><span class="o">-</span><span class="mi">1</span> <span class="n">presto</span> <span class="o">--</span><span class="n">server</span> <span class="n">presto</span><span class="o">-</span><span class="n">c [...]
 <span class="n">presto</span><span class="o">&gt;</span> <span class="n">show</span> <span class="n">catalogs</span><span class="o">;</span>
@@ -804,7 +801,7 @@ scala&gt; spark.sql("</span><span class="n">select</span> <span class="err">`</s
        <span class="nc">Table</span>
 <span class="o">--------------------</span>
  <span class="n">stock_ticks_cow</span>
- <span class="n">stock_ticks_mor_ro</span>
+ <span class="n">stock_ticks_mor</span>
  <span class="nf">stock_ticks_mor_rt</span>
 <span class="o">(</span><span class="mi">3</span> <span class="n">rows</span><span class="o">)</span>
 
@@ -842,10 +839,10 @@ scala&gt; spark.sql("</span><span class="n">select</span> <span class="err">`</s
 <span class="err">#</span> <span class="nc">Merge</span><span class="o">-</span><span class="nc">On</span><span class="o">-</span><span class="nc">Read</span> <span class="nl">Queries:</span>
 <span class="o">==========================</span>
 
-<span class="nc">Lets</span> <span class="n">run</span> <span class="n">similar</span> <span class="n">queries</span> <span class="n">against</span> <span class="no">M</span><span class="o">-</span><span class="no">O</span><span class="o">-</span><span class="no">R</span> <span class="n">table</span><span class="o">.</span> 
+<span class="nc">Lets</span> <span class="n">run</span> <span class="n">similar</span> <span class="n">queries</span> <span class="n">against</span> <span class="no">M</span><span class="o">-</span><span class="no">O</span><span class="o">-</span><span class="no">R</span> <span class="n">dataset</span><span class="o">.</span> 
 
-<span class="err">#</span> <span class="nc">Run</span> <span class="nc">ReadOptimized</span> <span class="nc">Query</span><span class="o">.</span> <span class="nc">Notice</span> <span class="n">that</span> <span class="n">the</span> <span class="n">latest</span> <span class="n">timestamp</span> <span class="n">is</span> <span class="mi">10</span><span class="o">:</span><span class="mi">29</span>
-    <span class="nl">presto:</span><span class="k">default</span><span class="o">&gt;</span> <span class="n">select</span> <span class="n">symbol</span><span class="o">,</span> <span class="n">max</span><span class="o">(</span><span class="n">ts</span><span class="o">)</span> <span class="n">from</span> <span class="n">stock_ticks_mor_ro</span> <span class="n">group</span> <span class="n">by</span> <span class="n">symbol</span> <span class="no">HAVING</span> <span class="n">symbol</span> [...]
+<span class="err">#</span> <span class="nc">Run</span> <span class="n">against</span> <span class="nc">ReadOptimized</span> <span class="nc">View</span><span class="o">.</span> <span class="nc">Notice</span> <span class="n">that</span> <span class="n">the</span> <span class="n">latest</span> <span class="n">timestamp</span> <span class="n">is</span> <span class="mi">10</span><span class="o">:</span><span class="mi">29</span>
+<span class="nl">presto:</span><span class="k">default</span><span class="o">&gt;</span> <span class="n">select</span> <span class="n">symbol</span><span class="o">,</span> <span class="n">max</span><span class="o">(</span><span class="n">ts</span><span class="o">)</span> <span class="n">from</span> <span class="n">stock_ticks_mor</span> <span class="n">group</span> <span class="n">by</span> <span class="n">symbol</span> <span class="no">HAVING</span> <span class="n">symbol</span> <span  [...]
  <span class="n">symbol</span> <span class="o">|</span>        <span class="n">_col1</span>
 <span class="o">--------+---------------------</span>
  <span class="no">GOOG</span>   <span class="o">|</span> <span class="mi">2018</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">31</span> <span class="mi">10</span><span class="o">:</span><span class="mi">29</span><span class="o">:</span><span class="mo">00</span>
@@ -856,7 +853,7 @@ scala&gt; spark.sql("</span><span class="n">select</span> <span class="err">`</s
 <span class="mi">0</span><span class="o">:</span><span class="mo">02</span> <span class="o">[</span><span class="mi">197</span> <span class="n">rows</span><span class="o">,</span> <span class="mi">613</span><span class="no">B</span><span class="o">]</span> <span class="o">[</span><span class="mi">110</span> <span class="n">rows</span><span class="o">/</span><span class="n">s</span><span class="o">,</span> <span class="mi">343</span><span class="no">B</span><span class="o">/</span><span c [...]
 
 
-<span class="nl">presto:</span><span class="k">default</span><span class="o">&gt;</span>  <span class="n">select</span> <span class="s">"_hoodie_commit_time"</span><span class="o">,</span> <span class="n">symbol</span><span class="o">,</span> <span class="n">ts</span><span class="o">,</span> <span class="n">volume</span><span class="o">,</span> <span class="n">open</span><span class="o">,</span> <span class="n">close</span>  <span class="n">from</span> <span class="n">stock_ticks_mor_ro< [...]
+<span class="nl">presto:</span><span class="k">default</span><span class="o">&gt;</span>  <span class="n">select</span> <span class="s">"_hoodie_commit_time"</span><span class="o">,</span> <span class="n">symbol</span><span class="o">,</span> <span class="n">ts</span><span class="o">,</span> <span class="n">volume</span><span class="o">,</span> <span class="n">open</span><span class="o">,</span> <span class="n">close</span>  <span class="n">from</span> <span class="n">stock_ticks_mor</sp [...]
  <span class="n">_hoodie_commit_time</span> <span class="o">|</span> <span class="n">symbol</span> <span class="o">|</span>         <span class="n">ts</span>          <span class="o">|</span> <span class="n">volume</span> <span class="o">|</span>   <span class="n">open</span>    <span class="o">|</span>  <span class="n">close</span>
 <span class="o">---------------------+--------+---------------------+--------+-----------+----------</span>
  <span class="mi">20190822180250</span>      <span class="o">|</span> <span class="no">GOOG</span>   <span class="o">|</span> <span class="mi">2018</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">31</span> <span class="mi">09</span><span class="o">:</span><span class="mi">59</span><span class="o">:</span><span class="mo">00</span> <span class="o">|</span>   <span class="mi">6330</span> <span class="o">|</span>    <span class="mf">1230.5</s [...]
@@ -880,12 +877,12 @@ partitions, there is no need to run hive-sync</p>
 <span class="err">#</span> <span class="nc">Within</span> <span class="nc">Docker</span> <span class="n">container</span><span class="o">,</span> <span class="n">run</span> <span class="n">the</span> <span class="n">ingestion</span> <span class="n">command</span>
 <span class="n">docker</span> <span class="n">exec</span> <span class="o">-</span><span class="n">it</span> <span class="n">adhoc</span><span class="o">-</span><span class="mi">2</span> <span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">bash</span>
 
-<span class="err">#</span> <span class="nc">Run</span> <span class="n">the</span> <span class="n">following</span> <span class="n">spark</span><span class="o">-</span><span class="n">submit</span> <span class="n">command</span> <span class="n">to</span> <span class="n">execute</span> <span class="n">the</span> <span class="n">delta</span><span class="o">-</span><span class="n">streamer</span> <span class="n">and</span> <span class="n">ingest</span> <span class="n">to</span> <span class=" [...]
-<span class="n">spark</span><span class="o">-</span><span class="n">submit</span> <span class="o">--</span><span class="kd">class</span> <span class="nc">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</span><span class="na">utilities</span><span class="o">.</span><span class="na">deltastreamer</span><span class="o">.</span><span class="na">HoodieDeltaStreamer</span> <span class="n">$HUDI_UTILITIES_BUND [...]
+<span class="err">#</span> <span class="nc">Run</span> <span class="n">the</span> <span class="n">following</span> <span class="n">spark</span><span class="o">-</span><span class="n">submit</span> <span class="n">command</span> <span class="n">to</span> <span class="n">execute</span> <span class="n">the</span> <span class="n">delta</span><span class="o">-</span><span class="n">streamer</span> <span class="n">and</span> <span class="n">ingest</span> <span class="n">to</span> <span class=" [...]
+<span class="n">spark</span><span class="o">-</span><span class="n">submit</span> <span class="o">--</span><span class="kd">class</span> <span class="nc">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</span><span class="na">utilities</span><span class="o">.</span><span class="na">deltastreamer</span><span class="o">.</span><span class="na">HoodieDeltaStreamer</span> <span class="n">$HUDI_UTILITIES_BUND [...]
 
 
-<span class="err">#</span> <span class="nc">Run</span> <span class="n">the</span> <span class="n">following</span> <span class="n">spark</span><span class="o">-</span><span class="n">submit</span> <span class="n">command</span> <span class="n">to</span> <span class="n">execute</span> <span class="n">the</span> <span class="n">delta</span><span class="o">-</span><span class="n">streamer</span> <span class="n">and</span> <span class="n">ingest</span> <span class="n">to</span> <span class=" [...]
-<span class="n">spark</span><span class="o">-</span><span class="n">submit</span> <span class="o">--</span><span class="kd">class</span> <span class="nc">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</span><span class="na">utilities</span><span class="o">.</span><span class="na">deltastreamer</span><span class="o">.</span><span class="na">HoodieDeltaStreamer</span> <span class="n">$HUDI_UTILITIES_BUND [...]
+<span class="err">#</span> <span class="nc">Run</span> <span class="n">the</span> <span class="n">following</span> <span class="n">spark</span><span class="o">-</span><span class="n">submit</span> <span class="n">command</span> <span class="n">to</span> <span class="n">execute</span> <span class="n">the</span> <span class="n">delta</span><span class="o">-</span><span class="n">streamer</span> <span class="n">and</span> <span class="n">ingest</span> <span class="n">to</span> <span class=" [...]
+<span class="n">spark</span><span class="o">-</span><span class="n">submit</span> <span class="o">--</span><span class="kd">class</span> <span class="nc">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</span><span class="na">utilities</span><span class="o">.</span><span class="na">deltastreamer</span><span class="o">.</span><span class="na">HoodieDeltaStreamer</span> <span class="n">$HUDI_UTILITIES_BUND [...]
 
 <span class="n">exit</span>
 </code></pre></div></div>
@@ -898,12 +895,12 @@ Take a look at the HDFS filesystem to get an idea: <code class="highlighter-roug
 
 <h3 id="step-6a-run-hive-queries">Step 6(a): Run Hive Queries</h3>
 
-<p>With Copy-On-Write table, the Snapshot query immediately sees the changes as part of second batch once the batch
+<p>With a Copy-On-Write table, the read-optimized view immediately sees the changes from the second batch once the batch
 got committed as each ingestion creates newer versions of parquet files.</p>
 
 <p>With Merge-On-Read table, the second ingestion merely appended the batch to an unmerged delta (log) file.
-This is the time, when ReadOptimized and Snapshot queries will provide different results. ReadOptimized query will still
-return “10:29 am” as it will only read from the Parquet file. Snapshot query will do on-the-fly merge and return
+This is when the ReadOptimized and Realtime views provide different results. The ReadOptimized view will still
+return “10:29 am” as it only reads from the Parquet file. The Realtime view will do an on-the-fly merge and return the
 latest committed data which is “10:59 a.m”.</p>
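The split between the two views can be illustrated with a small sketch. This is plain Python, not Hudi code, and the record shapes are hypothetical stand-ins; it only mimics the idea that a Merge-On-Read table serves a read-optimized view from compacted base (parquet) records while the realtime view merges the unmerged delta (log) records on the fly, latest record per key winning:

```python
# Conceptual sketch only -- not Hudi's implementation.

# Base file contents after the first ingestion (batch 1).
base = {"GOOG": {"ts": "2018-08-31 10:29:00"}}

# Delta log appended by the second ingestion (batch 2), not yet compacted.
delta_log = [("GOOG", {"ts": "2018-08-31 10:59:00"})]

def read_optimized_view(base):
    """Reads only base files, so it still reflects batch 1."""
    return dict(base)

def realtime_view(base, delta_log):
    """Merges log records over base records on the fly; the newest wins."""
    merged = dict(base)
    for key, record in delta_log:
        merged[key] = record  # newer record for the same key replaces the old
    return merged

print(read_optimized_view(base)["GOOG"]["ts"])       # 2018-08-31 10:29:00
print(realtime_view(base, delta_log)["GOOG"]["ts"])  # 2018-08-31 10:59:00
```

Once compaction folds the delta log into new base files, the two views converge again, which is why both returned the same timestamp after the first batch.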
 
 <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">docker</span> <span class="n">exec</span> <span class="o">-</span><span class="n">it</span> <span class="n">adhoc</span><span class="o">-</span><span class="mi">2</span> <span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">bash</span>
@@ -933,8 +930,8 @@ latest committed data which is “10:59 a.m”.</p>
 
 <span class="err">#</span> <span class="nc">Merge</span> <span class="nc">On</span> <span class="nc">Read</span> <span class="nl">Table:</span>
 
-<span class="err">#</span> <span class="nc">Read</span> <span class="nc">Optimized</span> <span class="nc">Query</span>
-<span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt; select symbol, max(ts) from stock_ticks_mor_ro group by symbol HAVING symbol = 'GOOG';</span>
+<span class="err">#</span> <span class="nc">Read</span> <span class="nc">Optimized</span> <span class="nc">View</span>
+<span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt; select symbol, max(ts) from stock_ticks_mor group by symbol HAVING symbol = 'GOOG';</span>
 <span class="nl">WARNING:</span> <span class="nc">Hive</span><span class="o">-</span><span class="n">on</span><span class="o">-</span><span class="no">MR</span> <span class="n">is</span> <span class="n">deprecated</span> <span class="n">in</span> <span class="nc">Hive</span> <span class="mi">2</span> <span class="n">and</span> <span class="n">may</span> <span class="n">not</span> <span class="n">be</span> <span class="n">available</span> <span class="n">in</span> <span class="n">the</spa [...]
 <span class="o">+---------+----------------------+--+</span>
 <span class="o">|</span> <span class="n">symbol</span>  <span class="o">|</span>         <span class="n">_c1</span>          <span class="o">|</span>
@@ -943,7 +940,7 @@ latest committed data which is “10:59 a.m”.</p>
 <span class="o">+---------+----------------------+--+</span>
 <span class="mi">1</span> <span class="n">row</span> <span class="nf">selected</span> <span class="o">(</span><span class="mf">1.6</span> <span class="n">seconds</span><span class="o">)</span>
 
-<span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt; select `_hoodie_commit_time`, symbol, ts, volume, open, close  from stock_ticks_mor_ro where  symbol = 'GOOG';</span>
+<span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt; select `_hoodie_commit_time`, symbol, ts, volume, open, close  from stock_ticks_mor where  symbol = 'GOOG';</span>
 <span class="o">+----------------------+---------+----------------------+---------+------------+-----------+--+</span>
 <span class="o">|</span> <span class="n">_hoodie_commit_time</span>  <span class="o">|</span> <span class="n">symbol</span>  <span class="o">|</span>          <span class="n">ts</span>          <span class="o">|</span> <span class="n">volume</span>  <span class="o">|</span>    <span class="n">open</span>    <span class="o">|</span>   <span class="n">close</span>   <span class="o">|</span>
 <span class="o">+----------------------+---------+----------------------+---------+------------+-----------+--+</span>
@@ -951,7 +948,7 @@ latest committed data which is “10:59 a.m”.</p>
 <span class="o">|</span> <span class="mi">20180924222155</span>       <span class="o">|</span> <span class="no">GOOG</span>    <span class="o">|</span> <span class="mi">2018</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">31</span> <span class="mi">10</span><span class="o">:</span><span class="mi">29</span><span class="o">:</span><span class="mo">00</span>  <span class="o">|</span> <span class="mi">3391</span>    <span class="o">|</span> < [...]
 <span class="o">+----------------------+---------+----------------------+---------+------------+-----------+--+</span>
 
-<span class="err">#</span> <span class="nc">Snapshot</span> <span class="nc">Query</span>
+<span class="err">#</span> <span class="nc">Realtime</span> <span class="nc">View</span>
 <span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt; select symbol, max(ts) from stock_ticks_mor_rt group by symbol HAVING symbol = 'GOOG';</span>
 <span class="nl">WARNING:</span> <span class="nc">Hive</span><span class="o">-</span><span class="n">on</span><span class="o">-</span><span class="no">MR</span> <span class="n">is</span> <span class="n">deprecated</span> <span class="n">in</span> <span class="nc">Hive</span> <span class="mi">2</span> <span class="n">and</span> <span class="n">may</span> <span class="n">not</span> <span class="n">be</span> <span class="n">available</span> <span class="n">in</span> <span class="n">the</spa [...]
 <span class="o">+---------+----------------------+--+</span>
@@ -977,7 +974,7 @@ latest committed data which is “10:59 a.m”.</p>
 <p>Running the same queries in Spark-SQL:</p>
 
 <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">docker</span> <span class="n">exec</span> <span class="o">-</span><span class="n">it</span> <span class="n">adhoc</span><span class="o">-</span><span class="mi">1</span> <span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">bash</span>
-<span class="n">bash</span><span class="o">-</span><span class="mf">4.4</span><span class="err">#</span> <span class="n">$SPARK_INSTALL</span><span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">spark</span><span class="o">-</span><span class="n">shell</span> <span class="o">--</span><span class="n">jars</span> <span class="n">$HUDI_SPARK_BUNDLE</span> <span class="o">--</span><span class="n">driver</span><span class="o">-</span><span class="kd">class [...]
+<span class="n">bash</span><span class="o">-</span><span class="mf">4.4</span><span class="err">#</span> <span class="n">$SPARK_INSTALL</span><span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">spark</span><span class="o">-</span><span class="n">shell</span> <span class="o">--</span><span class="n">jars</span> <span class="n">$HUDI_SPARK_BUNDLE</span> <span class="o">--</span><span class="n">driver</span><span class="o">-</span><span class="kd">class [...]
 
 <span class="err">#</span> <span class="nc">Copy</span> <span class="nc">On</span> <span class="nc">Write</span> <span class="nl">Table:</span>
 
@@ -1002,8 +999,8 @@ latest committed data which is “10:59 a.m”.</p>
 
 <span class="err">#</span> <span class="nc">Merge</span> <span class="nc">On</span> <span class="nc">Read</span> <span class="nl">Table:</span>
 
-<span class="err">#</span> <span class="nc">Read</span> <span class="nc">Optimized</span> <span class="nc">Query</span>
-<span class="n">scala</span><span class="o">&gt;</span> <span class="n">spark</span><span class="o">.</span><span class="na">sql</span><span class="o">(</span><span class="s">"select symbol, max(ts) from stock_ticks_mor_ro group by symbol HAVING symbol = 'GOOG'"</span><span class="o">).</span><span class="na">show</span><span class="o">(</span><span class="mi">100</span><span class="o">,</span> <span class="kc">false</span><span class="o">)</span>
+<span class="err">#</span> <span class="nc">Read</span> <span class="nc">Optimized</span> <span class="nc">View</span>
+<span class="n">scala</span><span class="o">&gt;</span> <span class="n">spark</span><span class="o">.</span><span class="na">sql</span><span class="o">(</span><span class="s">"select symbol, max(ts) from stock_ticks_mor group by symbol HAVING symbol = 'GOOG'"</span><span class="o">).</span><span class="na">show</span><span class="o">(</span><span class="mi">100</span><span class="o">,</span> <span class="kc">false</span><span class="o">)</span>
 <span class="o">+---------+----------------------+--+</span>
 <span class="o">|</span> <span class="n">symbol</span>  <span class="o">|</span>         <span class="n">_c1</span>          <span class="o">|</span>
 <span class="o">+---------+----------------------+--+</span>
@@ -1011,7 +1008,7 @@ latest committed data which is “10:59 a.m”.</p>
 <span class="o">+---------+----------------------+--+</span>
 <span class="mi">1</span> <span class="n">row</span> <span class="nf">selected</span> <span class="o">(</span><span class="mf">1.6</span> <span class="n">seconds</span><span class="o">)</span>
 
-<span class="n">scala</span><span class="o">&gt;</span> <span class="n">spark</span><span class="o">.</span><span class="na">sql</span><span class="o">(</span><span class="s">"select `_hoodie_commit_time`, symbol, ts, volume, open, close  from stock_ticks_mor_ro where  symbol = 'GOOG'"</span><span class="o">).</span><span class="na">show</span><span class="o">(</span><span class="mi">100</span><span class="o">,</span> <span class="kc">false</span><span class="o">)</span>
+<span class="n">scala</span><span class="o">&gt;</span> <span class="n">spark</span><span class="o">.</span><span class="na">sql</span><span class="o">(</span><span class="s">"select `_hoodie_commit_time`, symbol, ts, volume, open, close  from stock_ticks_mor where  symbol = 'GOOG'"</span><span class="o">).</span><span class="na">show</span><span class="o">(</span><span class="mi">100</span><span class="o">,</span> <span class="kc">false</span><span class="o">)</span>
 <span class="o">+----------------------+---------+----------------------+---------+------------+-----------+--+</span>
 <span class="o">|</span> <span class="n">_hoodie_commit_time</span>  <span class="o">|</span> <span class="n">symbol</span>  <span class="o">|</span>          <span class="n">ts</span>          <span class="o">|</span> <span class="n">volume</span>  <span class="o">|</span>    <span class="n">open</span>    <span class="o">|</span>   <span class="n">close</span>   <span class="o">|</span>
 <span class="o">+----------------------+---------+----------------------+---------+------------+-----------+--+</span>
@@ -1019,7 +1016,7 @@ latest committed data which is “10:59 a.m”.</p>
 <span class="o">|</span> <span class="mi">20180924222155</span>       <span class="o">|</span> <span class="no">GOOG</span>    <span class="o">|</span> <span class="mi">2018</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">31</span> <span class="mi">10</span><span class="o">:</span><span class="mi">29</span><span class="o">:</span><span class="mo">00</span>  <span class="o">|</span> <span class="mi">3391</span>    <span class="o">|</span> < [...]
 <span class="o">+----------------------+---------+----------------------+---------+------------+-----------+--+</span>
 
-<span class="err">#</span> <span class="nc">Snapshot</span> <span class="nc">Query</span>
+<span class="err">#</span> <span class="nc">Realtime</span> <span class="nc">View</span>
 <span class="n">scala</span><span class="o">&gt;</span> <span class="n">spark</span><span class="o">.</span><span class="na">sql</span><span class="o">(</span><span class="s">"select symbol, max(ts) from stock_ticks_mor_rt group by symbol HAVING symbol = 'GOOG'"</span><span class="o">).</span><span class="na">show</span><span class="o">(</span><span class="mi">100</span><span class="o">,</span> <span class="kc">false</span><span class="o">)</span>
 <span class="o">+---------+----------------------+--+</span>
 <span class="o">|</span> <span class="n">symbol</span>  <span class="o">|</span>         <span class="n">_c1</span>          <span class="o">|</span>
@@ -1041,7 +1038,7 @@ latest committed data which is “10:59 a.m”.</p>
 
 <h3 id="step-6c-run-presto-queries">Step 6(c): Run Presto Queries</h3>
 
-<p>Running the same queries on Presto for ReadOptimized queries.</p>
+<p>Running the same queries on Presto against the ReadOptimized views:</p>
 
 <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">docker</span> <span class="n">exec</span> <span class="o">-</span><span class="n">it</span> <span class="n">presto</span><span class="o">-</span><span class="n">worker</span><span class="o">-</span><span class="mi">1</span> <span class="n">presto</span> <span class="o">--</span><span class="n">server</span> <span class="n">presto</span><span class="o">-</span><span class="n">c [...]
 <span class="n">presto</span><span class="o">&gt;</span> <span class="n">use</span> <span class="n">hive</span><span class="o">.</span><span class="na">default</span><span class="o">;</span>
@@ -1075,8 +1072,8 @@ latest committed data which is “10:59 a.m”.</p>
 
 <span class="err">#</span> <span class="nc">Merge</span> <span class="nc">On</span> <span class="nc">Read</span> <span class="nl">Table:</span>
 
-<span class="err">#</span> <span class="nc">Read</span> <span class="nc">Optimized</span> <span class="nc">Query</span>
-<span class="nl">presto:</span><span class="k">default</span><span class="o">&gt;</span> <span class="n">select</span> <span class="n">symbol</span><span class="o">,</span> <span class="n">max</span><span class="o">(</span><span class="n">ts</span><span class="o">)</span> <span class="n">from</span> <span class="n">stock_ticks_mor_ro</span> <span class="n">group</span> <span class="n">by</span> <span class="n">symbol</span> <span class="no">HAVING</span> <span class="n">symbol</span> <sp [...]
+<span class="err">#</span> <span class="nc">Read</span> <span class="nc">Optimized</span> <span class="nc">View</span>
+<span class="nl">presto:</span><span class="k">default</span><span class="o">&gt;</span> <span class="n">select</span> <span class="n">symbol</span><span class="o">,</span> <span class="n">max</span><span class="o">(</span><span class="n">ts</span><span class="o">)</span> <span class="n">from</span> <span class="n">stock_ticks_mor</span> <span class="n">group</span> <span class="n">by</span> <span class="n">symbol</span> <span class="no">HAVING</span> <span class="n">symbol</span> <span  [...]
  <span class="n">symbol</span> <span class="o">|</span>        <span class="n">_col1</span>
 <span class="o">--------+---------------------</span>
  <span class="no">GOOG</span>   <span class="o">|</span> <span class="mi">2018</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">31</span> <span class="mi">10</span><span class="o">:</span><span class="mi">29</span><span class="o">:</span><span class="mo">00</span>
@@ -1086,7 +1083,7 @@ latest committed data which is “10:59 a.m”.</p>
 <span class="nl">Splits:</span> <span class="mi">49</span> <span class="n">total</span><span class="o">,</span> <span class="mi">49</span> <span class="n">done</span> <span class="o">(</span><span class="mf">100.00</span><span class="o">%)</span>
 <span class="mi">0</span><span class="o">:</span><span class="mo">01</span> <span class="o">[</span><span class="mi">197</span> <span class="n">rows</span><span class="o">,</span> <span class="mi">613</span><span class="no">B</span><span class="o">]</span> <span class="o">[</span><span class="mi">139</span> <span class="n">rows</span><span class="o">/</span><span class="n">s</span><span class="o">,</span> <span class="mi">435</span><span class="no">B</span><span class="o">/</span><span c [...]
 
-<span class="nl">presto:</span><span class="k">default</span><span class="o">&gt;</span><span class="n">select</span> <span class="s">"_hoodie_commit_time"</span><span class="o">,</span> <span class="n">symbol</span><span class="o">,</span> <span class="n">ts</span><span class="o">,</span> <span class="n">volume</span><span class="o">,</span> <span class="n">open</span><span class="o">,</span> <span class="n">close</span>  <span class="n">from</span> <span class="n">stock_ticks_mor_ro</s [...]
+<span class="nl">presto:</span><span class="k">default</span><span class="o">&gt;</span><span class="n">select</span> <span class="s">"_hoodie_commit_time"</span><span class="o">,</span> <span class="n">symbol</span><span class="o">,</span> <span class="n">ts</span><span class="o">,</span> <span class="n">volume</span><span class="o">,</span> <span class="n">open</span><span class="o">,</span> <span class="n">close</span>  <span class="n">from</span> <span class="n">stock_ticks_mor</span [...]
  <span class="n">_hoodie_commit_time</span> <span class="o">|</span> <span class="n">symbol</span> <span class="o">|</span>         <span class="n">ts</span>          <span class="o">|</span> <span class="n">volume</span> <span class="o">|</span>   <span class="n">open</span>    <span class="o">|</span>  <span class="n">close</span>
 <span class="o">---------------------+--------+---------------------+--------+-----------+----------</span>
  <span class="mi">20190822180250</span>      <span class="o">|</span> <span class="no">GOOG</span>   <span class="o">|</span> <span class="mi">2018</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">31</span> <span class="mi">09</span><span class="o">:</span><span class="mi">59</span><span class="o">:</span><span class="mo">00</span> <span class="o">|</span>   <span class="mi">6330</span> <span class="o">|</span>    <span class="mf">1230.5</s [...]
@@ -1102,7 +1099,7 @@ latest committed data which is “10:59 a.m”.</p>
 
 <h3 id="step-7--incremental-query-for-copy-on-write-table">Step 7 : Incremental Query for COPY-ON-WRITE Table</h3>
 
-<p>With 2 batches of data ingested, lets showcase the support for incremental queries in Hudi Copy-On-Write tables</p>
+<p>With two batches of data ingested, let's showcase the support for incremental queries in Hudi Copy-On-Write datasets.</p>
 
 <p>Let's take the same projection query example:</p>
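The core semantics of the incremental query shown in this step can be sketched with plain Java: return only the rows whose `_hoodie_commit_time` is strictly greater than a supplied begin instant. This is a hypothetical illustration (the `Row` class and `incremental` method are not Hudi APIs; the commit timestamps are the sample ones from this demo).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Hudi's API: the filtering semantics of an
// incremental query -- keep only rows committed after the begin instant.
public class IncrementalQuerySketch {
    static class Row {
        final String commitTime; // _hoodie_commit_time
        final String symbol;
        Row(String commitTime, String symbol) {
            this.commitTime = commitTime;
            this.symbol = symbol;
        }
    }

    // Commit instants are fixed-width yyyyMMddHHmmss strings, so
    // lexicographic comparison matches chronological order.
    static List<Row> incremental(List<Row> table, String beginInstant) {
        List<Row> out = new ArrayList<>();
        for (Row r : table) {
            if (r.commitTime.compareTo(beginInstant) > 0) {
                out.add(r);
            }
        }
        return out;
    }

    static int demoCount() {
        List<Row> table = new ArrayList<>();
        table.add(new Row("20180924064621", "GOOG")); // first commit
        table.add(new Row("20180924065039", "GOOG")); // second commit
        // Beginning at the first commit's instant, only the second qualifies.
        return incremental(table, "20180924064621").size();
    }

    public static void main(String[] args) {
        System.out.println(demoCount()); // 1
    }
}
```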
 
@@ -1154,15 +1151,15 @@ Here is the incremental query :</p>
 
 <h3 id="incremental-query-with-spark-sql">Incremental Query with Spark SQL:</h3>
 <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">docker</span> <span class="n">exec</span> <span class="o">-</span><span class="n">it</span> <span class="n">adhoc</span><span class="o">-</span><span class="mi">1</span> <span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">bash</span>
-<span class="n">bash</span><span class="o">-</span><span class="mf">4.4</span><span class="err">#</span> <span class="n">$SPARK_INSTALL</span><span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">spark</span><span class="o">-</span><span class="n">shell</span> <span class="o">--</span><span class="n">jars</span> <span class="n">$HUDI_SPARK_BUNDLE</span> <span class="o">--</span><span class="n">driver</span><span class="o">-</span><span class="kd">class [...]
+<span class="n">bash</span><span class="o">-</span><span class="mf">4.4</span><span class="err">#</span> <span class="n">$SPARK_INSTALL</span><span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">spark</span><span class="o">-</span><span class="n">shell</span> <span class="o">--</span><span class="n">jars</span> <span class="n">$HUDI_SPARK_BUNDLE</span> <span class="o">--</span><span class="n">driver</span><span class="o">-</span><span class="kd">class [...]
 <span class="nc">Welcome</span> <span class="n">to</span>
       <span class="n">____</span>              <span class="n">__</span>
      <span class="o">/</span> <span class="n">__</span><span class="o">/</span><span class="n">__</span>  <span class="n">___</span> <span class="n">_____</span><span class="o">/</span> <span class="o">/</span><span class="n">__</span>
     <span class="n">_</span><span class="err">\</span> <span class="err">\</span><span class="o">/</span> <span class="n">_</span> <span class="err">\</span><span class="o">/</span> <span class="n">_</span> <span class="err">`</span><span class="o">/</span> <span class="n">__</span><span class="o">/</span>  <span class="err">'</span><span class="n">_</span><span class="o">/</span>
-   <span class="o">/</span><span class="n">___</span><span class="o">/</span> <span class="o">.</span><span class="na">__</span><span class="o">/</span><span class="err">\</span><span class="n">_</span><span class="o">,</span><span class="n">_</span><span class="o">/</span><span class="n">_</span><span class="o">/</span> <span class="o">/</span><span class="n">_</span><span class="o">/</span><span class="err">\</span><span class="n">_</span><span class="err">\</span>   <span class="n">ve [...]
+   <span class="o">/</span><span class="n">___</span><span class="o">/</span> <span class="o">.</span><span class="na">__</span><span class="o">/</span><span class="err">\</span><span class="n">_</span><span class="o">,</span><span class="n">_</span><span class="o">/</span><span class="n">_</span><span class="o">/</span> <span class="o">/</span><span class="n">_</span><span class="o">/</span><span class="err">\</span><span class="n">_</span><span class="err">\</span>   <span class="n">ve [...]
       <span class="o">/</span><span class="n">_</span><span class="o">/</span>
 
-<span class="nc">Using</span> <span class="nc">Scala</span> <span class="n">version</span> <span class="mf">2.11</span><span class="o">.</span><span class="mi">12</span> <span class="o">(</span><span class="nc">OpenJDK</span> <span class="mi">64</span><span class="o">-</span><span class="nc">Bit</span> <span class="nc">Server</span> <span class="no">VM</span><span class="o">,</span> <span class="nc">Java</span> <span class="mf">1.8</span><span class="o">.</span><span class="mi">0_212</sp [...]
+<span class="nc">Using</span> <span class="nc">Scala</span> <span class="n">version</span> <span class="mf">2.11</span><span class="o">.</span><span class="mi">8</span> <span class="o">(</span><span class="nc">Java</span> <span class="nf">HotSpot</span><span class="o">(</span><span class="no">TM</span><span class="o">)</span> <span class="mi">64</span><span class="o">-</span><span class="nc">Bit</span> <span class="nc">Server</span> <span class="no">VM</span><span class="o">,</span> <spa [...]
 <span class="nc">Type</span> <span class="n">in</span> <span class="n">expressions</span> <span class="n">to</span> <span class="n">have</span> <span class="n">them</span> <span class="n">evaluated</span><span class="o">.</span>
 <span class="nc">Type</span> <span class="o">:</span><span class="n">help</span> <span class="k">for</span> <span class="n">more</span> <span class="n">information</span><span class="o">.</span>
 
@@ -1170,7 +1167,7 @@ Here is the incremental query :</p>
 <span class="kn">import</span> <span class="nn">org.apache.hudi.DataSourceReadOptions</span>
 
 <span class="err">#</span> <span class="nc">In</span> <span class="n">the</span> <span class="n">below</span> <span class="n">query</span><span class="o">,</span> <span class="mi">20180925045257</span> <span class="n">is</span> <span class="n">the</span> <span class="n">first</span> <span class="n">commit</span><span class="err">'</span><span class="n">s</span> <span class="n">timestamp</span>
-<span class="n">scala</span><span class="o">&gt;</span> <span class="n">val</span> <span class="n">hoodieIncViewDF</span> <span class="o">=</span>  <span class="n">spark</span><span class="o">.</span><span class="na">read</span><span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"org.apache.hudi"</span><span class="o">).</span><span class="na">option</span><span class="o">(</span><span class="nc">DataSourceReadOptions</span><span class="o">.</spa [...]
+<span class="n">scala</span><span class="o">&gt;</span> <span class="n">val</span> <span class="n">hoodieIncViewDF</span> <span class="o">=</span>  <span class="n">spark</span><span class="o">.</span><span class="na">read</span><span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"org.apache.hudi"</span><span class="o">).</span><span class="na">option</span><span class="o">(</span><span class="nc">DataSourceReadOptions</span><span class="o">.</spa [...]
 <span class="nl">SLF4J:</span> <span class="nc">Failed</span> <span class="n">to</span> <span class="n">load</span> <span class="kd">class</span> <span class="err">"</span><span class="nc">org</span><span class="o">.</span><span class="na">slf4j</span><span class="o">.</span><span class="na">impl</span><span class="o">.</span><span class="na">StaticLoggerBinder</span><span class="s">".
 SLF4J: Defaulting to no-operation (NOP) logger implementation
 SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
@@ -1188,38 +1185,31 @@ scala&gt; spark.sql("</span><span class="n">select</span> <span class="err">`</s
 
 </code></pre></div></div>
 
-<h3 id="step-8-schedule-and-run-compaction-for-merge-on-read-table">Step 8: Schedule and Run Compaction for Merge-On-Read table</h3>
+<h3 id="step-8-schedule-and-run-compaction-for-merge-on-read-dataset">Step 8: Schedule and Run Compaction for Merge-On-Read dataset</h3>
 
 <p>Let's schedule and run a compaction to create a new version of the columnar file so that read-optimized readers will see fresher data.
 Again, you can use the Hudi CLI to manually schedule and run the compaction.</p>
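Conceptually, compaction folds the delta log into a fresh base file, after which the ReadOptimized view catches up with the Realtime view. A minimal, hypothetical Java sketch of that effect (the `compact` method and its maps are illustrative only, not Hudi's implementation):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Hudi's API: compaction folds the delta log into
// a new base file, after which the ReadOptimized view sees the same data
// as the Realtime view.
public class CompactionSketch {
    // Merge log records into a copy of the base file; log entries win
    // for overlapping keys because they are newer.
    static Map<String, String> compact(Map<String, String> baseFile,
                                       Map<String, String> logFile) {
        Map<String, String> newBase = new HashMap<>(baseFile);
        newBase.putAll(logFile);
        return newBase;
    }

    static String demo() {
        Map<String, String> base = new HashMap<>();
        base.put("GOOG", "10:29"); // value in parquet before compaction
        Map<String, String> log = new HashMap<>();
        log.put("GOOG", "10:59");  // newer value sitting in the log
        return compact(base, log).get("GOOG");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 10:59
    }
}
```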
 
 <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">docker</span> <span class="n">exec</span> <span class="o">-</span><span class="n">it</span> <span class="n">adhoc</span><span class="o">-</span><span class="mi">1</span> <span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">bash</span>
 <span class="n">root</span><span class="nd">@adhoc</span><span class="o">-</span><span class="mi">1</span><span class="o">:/</span><span class="n">opt</span><span class="err">#</span>   <span class="o">/</span><span class="kt">var</span><span class="o">/</span><span class="n">hoodie</span><span class="o">/</span><span class="n">ws</span><span class="o">/</span><span class="n">hudi</span><span class="o">-</span><span class="n">cli</span><span class="o">/</span><span class="n">hudi</span>< [...]
-<span class="o">...</span>
-<span class="nc">Table</span> <span class="n">command</span> <span class="n">getting</span> <span class="n">loaded</span>
-<span class="nc">HoodieSplashScreen</span> <span class="n">loaded</span>
-<span class="o">===================================================================</span>
-<span class="o">*</span>         <span class="n">___</span>                          <span class="n">___</span>                        <span class="o">*</span>
-<span class="o">*</span>        <span class="o">/</span><span class="err">\</span><span class="n">__</span><span class="err">\</span>          <span class="n">___</span>           <span class="o">/</span><span class="err">\</span>  <span class="err">\</span>           <span class="n">___</span>         <span class="o">*</span>
-<span class="o">*</span>       <span class="o">/</span> <span class="o">/</span>  <span class="o">/</span>         <span class="o">/</span><span class="err">\</span><span class="n">__</span><span class="err">\</span>         <span class="o">/</span>  <span class="err">\</span>  <span class="err">\</span>         <span class="o">/</span><span class="err">\</span>  <span class="err">\</span>        <span class="o">*</span>
-<span class="o">*</span>      <span class="o">/</span> <span class="o">/</span><span class="n">__</span><span class="o">/</span>         <span class="o">/</span> <span class="o">/</span>  <span class="o">/</span>        <span class="o">/</span> <span class="o">/</span><span class="err">\</span> <span class="err">\</span>  <span class="err">\</span>        <span class="err">\</span> <span class="err">\</span>  <span class="err">\</span>       <span class="o">*</span>
-<span class="o">*</span>     <span class="o">/</span>  <span class="err">\</span>  <span class="err">\</span> <span class="n">___</span>    <span class="o">/</span> <span class="o">/</span>  <span class="o">/</span>        <span class="o">/</span> <span class="o">/</span>  <span class="err">\</span> <span class="err">\</span><span class="n">__</span><span class="err">\</span>       <span class="o">/</span>  <span class="err">\</span><span class="n">__</span><span class="err">\</span>     [...]
-<span class="o">*</span>    <span class="o">/</span> <span class="o">/</span><span class="err">\</span> <span class="err">\</span>  <span class="o">/</span><span class="err">\</span><span class="n">__</span><span class="err">\</span>  <span class="o">/</span> <span class="o">/</span><span class="n">__</span><span class="o">/</span>  <span class="n">___</span>   <span class="o">/</span> <span class="o">/</span><span class="n">__</span><span class="o">/</span> <span class="err">\</span> <s [...]
-<span class="o">*</span>    <span class="err">\</span><span class="o">/</span>  <span class="err">\</span> <span class="err">\</span><span class="o">/</span> <span class="o">/</span>  <span class="o">/</span>  <span class="err">\</span> <span class="err">\</span>  <span class="err">\</span> <span class="o">/</span><span class="err">\</span><span class="n">__</span><span class="err">\</span>  <span class="err">\</span> <span class="err">\</span>  <span class="err">\</span> <span class="o" [...]
-<span class="o">*</span>         <span class="err">\</span>  <span class="o">/</span>  <span class="o">/</span>    <span class="err">\</span> <span class="err">\</span>  <span class="o">/</span> <span class="o">/</span>  <span class="o">/</span>   <span class="err">\</span> <span class="err">\</span>  <span class="o">/</span> <span class="o">/</span>  <span class="o">/</span>   <span class="err">\</span>  <span class="o">/</span><span class="n">__</span><span class="o">/</span>           [...]
-<span class="o">*</span>         <span class="o">/</span> <span class="o">/</span>  <span class="o">/</span>      <span class="err">\</span> <span class="err">\</span><span class="o">/</span> <span class="o">/</span>  <span class="o">/</span>     <span class="err">\</span> <span class="err">\</span><span class="o">/</span> <span class="o">/</span>  <span class="o">/</span>     <span class="err">\</span> <span class="err">\</span><span class="n">__</span><span class="err">\</span>         [...]
-<span class="o">*</span>        <span class="o">/</span> <span class="o">/</span>  <span class="o">/</span>        <span class="err">\</span>  <span class="o">/</span>  <span class="o">/</span>       <span class="err">\</span>  <span class="o">/</span>  <span class="o">/</span>       <span class="err">\</span><span class="o">/</span><span class="n">__</span><span class="o">/</span>          <span class="o">*</span>
-<span class="o">*</span>        <span class="err">\</span><span class="o">/</span><span class="n">__</span><span class="o">/</span>          <span class="err">\</span><span class="o">/</span><span class="n">__</span><span class="o">/</span>         <span class="err">\</span><span class="o">/</span><span class="n">__</span><span class="o">/</span>    <span class="nc">Apache</span> <span class="nc">Hudi</span> <span class="no">CLI</span>    <span class="o">*</span>
-<span class="o">*</span>                                                                 <span class="o">*</span>
-<span class="o">===================================================================</span>
-
-<span class="nc">Welcome</span> <span class="n">to</span> <span class="nc">Apache</span> <span class="nc">Hudi</span> <span class="no">CLI</span><span class="o">.</span> <span class="nc">Please</span> <span class="n">type</span> <span class="n">help</span> <span class="k">if</span> <span class="n">you</span> <span class="n">are</span> <span class="n">looking</span> <span class="k">for</span> <span class="n">help</span><span class="o">.</span>
+<span class="o">============================================</span>
+<span class="o">*</span>                                          <span class="o">*</span>
+<span class="o">*</span>     <span class="n">_</span>    <span class="n">_</span>           <span class="n">_</span>   <span class="n">_</span>               <span class="o">*</span>
+<span class="o">*</span>    <span class="o">|</span> <span class="o">|</span>  <span class="o">|</span> <span class="o">|</span>         <span class="o">|</span> <span class="o">|</span> <span class="o">(</span><span class="n">_</span><span class="o">)</span>              <span class="o">*</span>
+<span class="o">*</span>    <span class="o">|</span> <span class="o">|</span><span class="n">__</span><span class="o">|</span> <span class="o">|</span>       <span class="n">__</span><span class="o">|</span> <span class="o">|</span>  <span class="o">-</span>               <span class="o">*</span>
+<span class="o">*</span>    <span class="o">|</span>  <span class="n">__</span>  <span class="o">||</span>   <span class="o">|</span> <span class="o">/</span> <span class="n">_</span><span class="err">`</span> <span class="o">|</span> <span class="o">||</span>               <span class="o">*</span>
+<span class="o">*</span>    <span class="o">|</span> <span class="o">|</span>  <span class="o">|</span> <span class="o">||</span>   <span class="o">||</span> <span class="o">(</span><span class="n">_</span><span class="o">|</span> <span class="o">|</span> <span class="o">||</span>               <span class="o">*</span>
+<span class="o">*</span>    <span class="o">|</span><span class="n">_</span><span class="o">|</span>  <span class="o">|</span><span class="n">_</span><span class="o">|</span><span class="err">\</span><span class="n">___</span><span class="o">/</span> <span class="err">\</span><span class="n">____</span><span class="o">/</span> <span class="o">||</span>               <span class="o">*</span>
+<span class="o">*</span>                                          <span class="o">*</span>
+<span class="o">============================================</span>
+
+<span class="nc">Welcome</span> <span class="n">to</span> <span class="nc">Hoodie</span> <span class="no">CLI</span><span class="o">.</span> <span class="nc">Please</span> <span class="n">type</span> <span class="n">help</span> <span class="k">if</span> <span class="n">you</span> <span class="n">are</span> <span class="n">looking</span> <span class="k">for</span> <span class="n">help</span><span class="o">.</span>
 <span class="n">hudi</span><span class="o">-&gt;</span><span class="n">connect</span> <span class="o">--</span><span class="n">path</span> <span class="o">/</span><span class="n">user</span><span class="o">/</span><span class="n">hive</span><span class="o">/</span><span class="n">warehouse</span><span class="o">/</span><span class="n">stock_ticks_mor</span>
 <span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mi">24</span> <span class="mo">06</span><span class="o">:</span><span class="mi">59</span><span class="o">:</span><span class="mi">34</span> <span class="no">WARN</span> <span class="n">util</span><span class="o">.</span><span class="na">NativeCodeLoader</span><span class="o">:</span> <span class="nc">Unable</span> <span class="n">to</span> <span class="n">load</span> <span cl [...]
 <span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mi">24</span> <span class="mo">06</span><span class="o">:</span><span class="mi">59</span><span class="o">:</span><span class="mi">35</span> <span class="no">INFO</span> <span class="n">table</span><span class="o">.</span><span class="na">HoodieTableMetaClient</span><span class="o">:</span> <span class="nc">Loading</span> <span class="nc">HoodieTableMetaClient</span> <span cla [...]
 <span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mi">24</span> <span class="mo">06</span><span class="o">:</span><span class="mi">59</span><span class="o">:</span><span class="mi">35</span> <span class="no">INFO</span> <span class="n">util</span><span class="o">.</span><span class="na">FSUtils</span><span class="o">:</span> <span class="nc">Hadoop</span> <span class="nl">Configuration:</span> <span class="n">fs</span><span c [...]
-<span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mi">24</span> <span class="mo">06</span><span class="o">:</span><span class="mi">59</span><span class="o">:</span><span class="mi">35</span> <span class="no">INFO</span> <span class="n">table</span><span class="o">.</span><span class="na">HoodieTableConfig</span><span class="o">:</span> <span class="nc">Loading</span> <span class="n">table</span> <span class="n">properties</sp [...]
-<span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mi">24</span> <span class="mo">06</span><span class="o">:</span><span class="mi">59</span><span class="o">:</span><span class="mi">36</span> <span class="no">INFO</span> <span class="n">table</span><span class="o">.</span><span class="na">HoodieTableMetaClient</span><span class="o">:</span> <span class="nc">Finished</span> <span class="nc">Loading</span> <span class="nc">Table [...]
+<span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mi">24</span> <span class="mo">06</span><span class="o">:</span><span class="mi">59</span><span class="o">:</span><span class="mi">35</span> <span class="no">INFO</span> <span class="n">table</span><span class="o">.</span><span class="na">HoodieTableConfig</span><span class="o">:</span> <span class="nc">Loading</span> <span class="n">dataset</span> <span class="n">properties</ [...]
+<span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mi">24</span> <span class="mo">06</span><span class="o">:</span><span class="mi">59</span><span class="o">:</span><span class="mi">36</span> <span class="no">INFO</span> <span class="n">table</span><span class="o">.</span><span class="na">HoodieTableMetaClient</span><span class="o">:</span> <span class="nc">Finished</span> <span class="nc">Loading</span> <span class="nc">Table [...]
 <span class="nc">Metadata</span> <span class="k">for</span> <span class="n">table</span> <span class="n">stock_ticks_mor</span> <span class="n">loaded</span>
 
 <span class="err">#</span> <span class="nc">Ensure</span> <span class="n">no</span> <span class="n">compactions</span> <span class="n">are</span> <span class="n">present</span>
@@ -1243,8 +1233,8 @@ Again, You can use Hudi CLI to manually schedule and run compaction</p>
 <span class="nl">hoodie:</span><span class="n">stock_ticks</span><span class="o">-&gt;</span><span class="n">connect</span> <span class="o">--</span><span class="n">path</span> <span class="o">/</span><span class="n">user</span><span class="o">/</span><span class="n">hive</span><span class="o">/</span><span class="n">warehouse</span><span class="o">/</span><span class="n">stock_ticks_mor</span>
 <span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mi">24</span> <span class="mo">07</span><span class="o">:</span><span class="mo">01</span><span class="o">:</span><span class="mi">16</span> <span class="no">INFO</span> <span class="n">table</span><span class="o">.</span><span class="na">HoodieTableMetaClient</span><span class="o">:</span> <span class="nc">Loading</span> <span class="nc">HoodieTableMetaClient</span> <span cla [...]
 <span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mi">24</span> <span class="mo">07</span><span class="o">:</span><span class="mo">01</span><span class="o">:</span><span class="mi">16</span> <span class="no">INFO</span> <span class="n">util</span><span class="o">.</span><span class="na">FSUtils</span><span class="o">:</span> <span class="nc">Hadoop</span> <span class="nl">Configuration:</span> <span class="n">fs</span><span c [...]
-<span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mi">24</span> <span class="mo">07</span><span class="o">:</span><span class="mo">01</span><span class="o">:</span><span class="mi">16</span> <span class="no">INFO</span> <span class="n">table</span><span class="o">.</span><span class="na">HoodieTableConfig</span><span class="o">:</span> <span class="nc">Loading</span> <span class="n">table</span> <span class="n">properties</sp [...]
-<span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mi">24</span> <span class="mo">07</span><span class="o">:</span><span class="mo">01</span><span class="o">:</span><span class="mi">16</span> <span class="no">INFO</span> <span class="n">table</span><span class="o">.</span><span class="na">HoodieTableMetaClient</span><span class="o">:</span> <span class="nc">Finished</span> <span class="nc">Loading</span> <span class="nc">Table [...]
+<span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mi">24</span> <span class="mo">07</span><span class="o">:</span><span class="mo">01</span><span class="o">:</span><span class="mi">16</span> <span class="no">INFO</span> <span class="n">table</span><span class="o">.</span><span class="na">HoodieTableConfig</span><span class="o">:</span> <span class="nc">Loading</span> <span class="n">dataset</span> <span class="n">properties</ [...]
+<span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mi">24</span> <span class="mo">07</span><span class="o">:</span><span class="mo">01</span><span class="o">:</span><span class="mi">16</span> <span class="no">INFO</span> <span class="n">table</span><span class="o">.</span><span class="na">HoodieTableMetaClient</span><span class="o">:</span> <span class="nc">Finished</span> <span class="nc">Loading</span> <span class="nc">Table [...]
 <span class="nc">Metadata</span> <span class="k">for</span> <span class="n">table</span> <span class="n">stock_ticks_mor</span> <span class="n">loaded</span>
 
 
@@ -1270,8 +1260,8 @@ Again, You can use Hudi CLI to manually schedule and run compaction</p>
 <span class="nl">hoodie:</span><span class="n">stock_ticks_mor</span><span class="o">-&gt;</span><span class="n">connect</span> <span class="o">--</span><span class="n">path</span> <span class="o">/</span><span class="n">user</span><span class="o">/</span><span class="n">hive</span><span class="o">/</span><span class="n">warehouse</span><span class="o">/</span><span class="n">stock_ticks_mor</span>
 <span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mi">24</span> <span class="mo">07</span><span class="o">:</span><span class="mo">03</span><span class="o">:</span><span class="mo">00</span> <span class="no">INFO</span> <span class="n">table</span><span class="o">.</span><span class="na">HoodieTableMetaClient</span><span class="o">:</span> <span class="nc">Loading</span> <span class="nc">HoodieTableMetaClient</span> <span cla [...]
 <span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mi">24</span> <span class="mo">07</span><span class="o">:</span><span class="mo">03</span><span class="o">:</span><span class="mo">00</span> <span class="no">INFO</span> <span class="n">util</span><span class="o">.</span><span class="na">FSUtils</span><span class="o">:</span> <span class="nc">Hadoop</span> <span class="nl">Configuration:</span> <span class="n">fs</span><span c [...]
-<span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mi">24</span> <span class="mo">07</span><span class="o">:</span><span class="mo">03</span><span class="o">:</span><span class="mo">00</span> <span class="no">INFO</span> <span class="n">table</span><span class="o">.</span><span class="na">HoodieTableConfig</span><span class="o">:</span> <span class="nc">Loading</span> <span class="n">table</span> <span class="n">properties</sp [...]
-<span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mi">24</span> <span class="mo">07</span><span class="o">:</span><span class="mo">03</span><span class="o">:</span><span class="mo">00</span> <span class="no">INFO</span> <span class="n">table</span><span class="o">.</span><span class="na">HoodieTableMetaClient</span><span class="o">:</span> <span class="nc">Finished</span> <span class="nc">Loading</span> <span class="nc">Table [...]
+<span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mi">24</span> <span class="mo">07</span><span class="o">:</span><span class="mo">03</span><span class="o">:</span><span class="mo">00</span> <span class="no">INFO</span> <span class="n">table</span><span class="o">.</span><span class="na">HoodieTableConfig</span><span class="o">:</span> <span class="nc">Loading</span> <span class="n">dataset</span> <span class="n">properties</ [...]
+<span class="mi">18</span><span class="o">/</span><span class="mi">09</span><span class="o">/</span><span class="mi">24</span> <span class="mo">07</span><span class="o">:</span><span class="mo">03</span><span class="o">:</span><span class="mo">00</span> <span class="no">INFO</span> <span class="n">table</span><span class="o">.</span><span class="na">HoodieTableMetaClient</span><span class="o">:</span> <span class="nc">Finished</span> <span class="nc">Loading</span> <span class="nc">Table [...]
 <span class="nc">Metadata</span> <span class="k">for</span> <span class="n">table</span> <span class="n">stock_ticks_mor</span> <span class="n">loaded</span>
 
 
@@ -1287,7 +1277,7 @@ Again, You can use Hudi CLI to manually schedule and run compaction</p>
 
 <h3 id="step-9-run-hive-queries-including-incremental-queries">Step 9: Run Hive Queries including incremental queries</h3>
 
-<p>You will see that both ReadOptimized and Snapshot queries will show the latest committed data.
+<p>You will see that both ReadOptimized and Realtime Views will show the latest committed data.
Let's also run the incremental query for the MOR table.
From the query output below, it is clear that the first commit time for the MOR table is 20180924064636
and the second commit time is 20180924070031</p>
@@ -1295,8 +1285,8 @@ and the second commit time is 20180924070031</p>
 <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">docker</span> <span class="n">exec</span> <span class="o">-</span><span class="n">it</span> <span class="n">adhoc</span><span class="o">-</span><span class="mi">2</span> <span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">bash</span>
 <span class="n">beeline</span> <span class="o">-</span><span class="n">u</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000 --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat --hiveconf hive.stats.autogather=false</span>
 
-<span class="err">#</span> <span class="nc">Read</span> <span class="nc">Optimized</span> <span class="nc">Query</span>
-<span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt; select symbol, max(ts) from stock_ticks_mor_ro group by symbol HAVING symbol = 'GOOG';</span>
+<span class="err">#</span> <span class="nc">Read</span> <span class="nc">Optimized</span> <span class="nc">View</span>
+<span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt; select symbol, max(ts) from stock_ticks_mor group by symbol HAVING symbol = 'GOOG';</span>
 <span class="nl">WARNING:</span> <span class="nc">Hive</span><span class="o">-</span><span class="n">on</span><span class="o">-</span><span class="no">MR</span> <span class="n">is</span> <span class="n">deprecated</span> <span class="n">in</span> <span class="nc">Hive</span> <span class="mi">2</span> <span class="n">and</span> <span class="n">may</span> <span class="n">not</span> <span class="n">be</span> <span class="n">available</span> <span class="n">in</span> <span class="n">the</spa [...]
 <span class="o">+---------+----------------------+--+</span>
 <span class="o">|</span> <span class="n">symbol</span>  <span class="o">|</span>         <span class="n">_c1</span>          <span class="o">|</span>
@@ -1305,7 +1295,7 @@ and the second commit time is 20180924070031</p>
 <span class="o">+---------+----------------------+--+</span>
 <span class="mi">1</span> <span class="n">row</span> <span class="nf">selected</span> <span class="o">(</span><span class="mf">1.6</span> <span class="n">seconds</span><span class="o">)</span>
 
-<span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt; select `_hoodie_commit_time`, symbol, ts, volume, open, close  from stock_ticks_mor_ro where  symbol = 'GOOG';</span>
+<span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt; select `_hoodie_commit_time`, symbol, ts, volume, open, close  from stock_ticks_mor where  symbol = 'GOOG';</span>
 <span class="o">+----------------------+---------+----------------------+---------+------------+-----------+--+</span>
 <span class="o">|</span> <span class="n">_hoodie_commit_time</span>  <span class="o">|</span> <span class="n">symbol</span>  <span class="o">|</span>          <span class="n">ts</span>          <span class="o">|</span> <span class="n">volume</span>  <span class="o">|</span>    <span class="n">open</span>    <span class="o">|</span>   <span class="n">close</span>   <span class="o">|</span>
 <span class="o">+----------------------+---------+----------------------+---------+------------+-----------+--+</span>
@@ -1313,7 +1303,7 @@ and the second commit time is 20180924070031</p>
 <span class="o">|</span> <span class="mi">20180924070031</span>       <span class="o">|</span> <span class="no">GOOG</span>    <span class="o">|</span> <span class="mi">2018</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">31</span> <span class="mi">10</span><span class="o">:</span><span class="mi">59</span><span class="o">:</span><span class="mo">00</span>  <span class="o">|</span> <span class="mi">9021</span>    <span class="o">|</span> < [...]
 <span class="o">+----------------------+---------+----------------------+---------+------------+-----------+--+</span>
 
-<span class="err">#</span> <span class="nc">Snapshot</span> <span class="nc">Query</span>
+<span class="err">#</span> <span class="nc">Realtime</span> <span class="nc">View</span>
 <span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt; select symbol, max(ts) from stock_ticks_mor_rt group by symbol HAVING symbol = 'GOOG';</span>
 <span class="nl">WARNING:</span> <span class="nc">Hive</span><span class="o">-</span><span class="n">on</span><span class="o">-</span><span class="no">MR</span> <span class="n">is</span> <span class="n">deprecated</span> <span class="n">in</span> <span class="nc">Hive</span> <span class="mi">2</span> <span class="n">and</span> <span class="n">may</span> <span class="n">not</span> <span class="n">be</span> <span class="n">available</span> <span class="n">in</span> <span class="n">the</spa [...]
 <span class="o">+---------+----------------------+--+</span>
@@ -1330,7 +1320,7 @@ and the second commit time is 20180924070031</p>
 <span class="o">|</span> <span class="mi">20180924070031</span>       <span class="o">|</span> <span class="no">GOOG</span>    <span class="o">|</span> <span class="mi">2018</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">31</span> <span class="mi">10</span><span class="o">:</span><span class="mi">59</span><span class="o">:</span><span class="mo">00</span>  <span class="o">|</span> <span class="mi">9021</span>    <span class="o">|</span> < [...]
 <span class="o">+----------------------+---------+----------------------+---------+------------+-----------+--+</span>
 
-<span class="err">#</span> <span class="nc">Incremental</span> <span class="nl">Query:</span>
+<span class="err">#</span> <span class="nc">Incremental</span> <span class="nl">View:</span>
 
 <span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt; set hoodie.stock_ticks_mor.consume.mode=INCREMENTAL;</span>
 <span class="nc">No</span> <span class="n">rows</span> <span class="nf">affected</span> <span class="o">(</span><span class="mf">0.008</span> <span class="n">seconds</span><span class="o">)</span>
@@ -1340,7 +1330,7 @@ and the second commit time is 20180924070031</p>
 <span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt; set hoodie.stock_ticks_mor.consume.start.timestamp=20180924064636;</span>
 <span class="nc">No</span> <span class="n">rows</span> <span class="nf">affected</span> <span class="o">(</span><span class="mf">0.013</span> <span class="n">seconds</span><span class="o">)</span>
 <span class="err">#</span> <span class="nl">Query:</span>
-<span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt; select `_hoodie_commit_time`, symbol, ts, volume, open, close  from stock_ticks_mor_ro where  symbol = 'GOOG' and `_hoodie_commit_time` &gt; '20180924064636';</span>
+<span class="mi">0</span><span class="o">:</span> <span class="nl">jdbc:hive2:</span><span class="c1">//hiveserver:10000&gt; select `_hoodie_commit_time`, symbol, ts, volume, open, close  from stock_ticks_mor where  symbol = 'GOOG' and `_hoodie_commit_time` &gt; '20180924064636';</span>
 <span class="o">+----------------------+---------+----------------------+---------+------------+-----------+--+</span>
 <span class="o">|</span> <span class="n">_hoodie_commit_time</span>  <span class="o">|</span> <span class="n">symbol</span>  <span class="o">|</span>          <span class="n">ts</span>          <span class="o">|</span> <span class="n">volume</span>  <span class="o">|</span>    <span class="n">open</span>    <span class="o">|</span>   <span class="n">close</span>   <span class="o">|</span>
 <span class="o">+----------------------+---------+----------------------+---------+------------+-----------+--+</span>
@@ -1350,13 +1340,13 @@ and the second commit time is 20180924070031</p>
 <span class="n">exit</span>
 </code></pre></div></div>
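<p>Putting the steps above together, the incremental pull on the MOR table boils down to two session settings plus a commit-time predicate. A minimal sketch of just those statements (the table name <code>stock_ticks_mor</code> and the commit timestamp <code>20180924064636</code> are taken from the demo output above):</p>

```sql
-- Switch the Hive session to incremental consumption for this table
set hoodie.stock_ticks_mor.consume.mode=INCREMENTAL;
-- Only pull commits made after the first commit time observed above
set hoodie.stock_ticks_mor.consume.start.timestamp=20180924064636;
-- The _hoodie_commit_time predicate must mirror the start timestamp
select `_hoodie_commit_time`, symbol, ts, volume, open, close
from stock_ticks_mor
where symbol = 'GOOG' and `_hoodie_commit_time` > '20180924064636';
```

<p>Note that the session property and the <code>where</code> predicate carry the same commit time; the predicate is what actually filters the returned rows to the second commit.</p>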
 
-<h3 id="step-10-read-optimized-and-snapshot-queries-for-mor-with-spark-sql-after-compaction">Step 10: Read Optimized and Snapshot queries for MOR with Spark-SQL after compaction</h3>
+<h3 id="step-10-read-optimized-and-realtime-views-for-mor-with-spark-sql-after-compaction">Step 10: Read Optimized and Realtime Views for MOR with Spark-SQL after compaction</h3>
 
 <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">docker</span> <span class="n">exec</span> <span class="o">-</span><span class="n">it</span> <span class="n">adhoc</span><span class="o">-</span><span class="mi">1</span> <span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">bash</span>
-<span class="n">bash</span><span class="o">-</span><span class="mf">4.4</span><span class="err">#</span> <span class="n">$SPARK_INSTALL</span><span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">spark</span><span class="o">-</span><span class="n">shell</span> <span class="o">--</span><span class="n">jars</span> <span class="n">$HUDI_SPARK_BUNDLE</span> <span class="o">--</span><span class="n">driver</span><span class="o">-</span><span class="kd">class [...]
+<span class="n">bash</span><span class="o">-</span><span class="mf">4.4</span><span class="err">#</span> <span class="n">$SPARK_INSTALL</span><span class="o">/</span><span class="n">bin</span><span class="o">/</span><span class="n">spark</span><span class="o">-</span><span class="n">shell</span> <span class="o">--</span><span class="n">jars</span> <span class="n">$HUDI_SPARK_BUNDLE</span> <span class="o">--</span><span class="n">driver</span><span class="o">-</span><span class="kd">class [...]
 
-<span class="err">#</span> <span class="nc">Read</span> <span class="nc">Optimized</span> <span class="nc">Query</span>
-<span class="n">scala</span><span class="o">&gt;</span> <span class="n">spark</span><span class="o">.</span><span class="na">sql</span><span class="o">(</span><span class="s">"select symbol, max(ts) from stock_ticks_mor_ro group by symbol HAVING symbol = 'GOOG'"</span><span class="o">).</span><span class="na">show</span><span class="o">(</span><span class="mi">100</span><span class="o">,</span> <span class="kc">false</span><span class="o">)</span>
+<span class="err">#</span> <span class="nc">Read</span> <span class="nc">Optimized</span> <span class="nc">View</span>
+<span class="n">scala</span><span class="o">&gt;</span> <span class="n">spark</span><span class="o">.</span><span class="na">sql</span><span class="o">(</span><span class="s">"select symbol, max(ts) from stock_ticks_mor group by symbol HAVING symbol = 'GOOG'"</span><span class="o">).</span><span class="na">show</span><span class="o">(</span><span class="mi">100</span><span class="o">,</span> <span class="kc">false</span><span class="o">)</span>
 <span class="o">+---------+----------------------+--+</span>
 <span class="o">|</span> <span class="n">symbol</span>  <span class="o">|</span>         <span class="n">_c1</span>          <span class="o">|</span>
 <span class="o">+---------+----------------------+--+</span>
@@ -1364,7 +1354,7 @@ and the second commit time is 20180924070031</p>
 <span class="o">+---------+----------------------+--+</span>
 <span class="mi">1</span> <span class="n">row</span> <span class="nf">selected</span> <span class="o">(</span><span class="mf">1.6</span> <span class="n">seconds</span><span class="o">)</span>
 
-<span class="n">scala</span><span class="o">&gt;</span> <span class="n">spark</span><span class="o">.</span><span class="na">sql</span><span class="o">(</span><span class="s">"select `_hoodie_commit_time`, symbol, ts, volume, open, close  from stock_ticks_mor_ro where  symbol = 'GOOG'"</span><span class="o">).</span><span class="na">show</span><span class="o">(</span><span class="mi">100</span><span class="o">,</span> <span class="kc">false</span><span class="o">)</span>
+<span class="n">scala</span><span class="o">&gt;</span> <span class="n">spark</span><span class="o">.</span><span class="na">sql</span><span class="o">(</span><span class="s">"select `_hoodie_commit_time`, symbol, ts, volume, open, close  from stock_ticks_mor where  symbol = 'GOOG'"</span><span class="o">).</span><span class="na">show</span><span class="o">(</span><span class="mi">100</span><span class="o">,</span> <span class="kc">false</span><span class="o">)</span>
 <span class="o">+----------------------+---------+----------------------+---------+------------+-----------+--+</span>
 <span class="o">|</span> <span class="n">_hoodie_commit_time</span>  <span class="o">|</span> <span class="n">symbol</span>  <span class="o">|</span>          <span class="n">ts</span>          <span class="o">|</span> <span class="n">volume</span>  <span class="o">|</span>    <span class="n">open</span>    <span class="o">|</span>   <span class="n">close</span>   <span class="o">|</span>
 <span class="o">+----------------------+---------+----------------------+---------+------------+-----------+--+</span>
@@ -1372,7 +1362,7 @@ and the second commit time is 20180924070031</p>
 <span class="o">|</span> <span class="mi">20180924070031</span>       <span class="o">|</span> <span class="no">GOOG</span>    <span class="o">|</span> <span class="mi">2018</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">31</span> <span class="mi">10</span><span class="o">:</span><span class="mi">59</span><span class="o">:</span><span class="mo">00</span>  <span class="o">|</span> <span class="mi">9021</span>    <span class="o">|</span> < [...]
 <span class="o">+----------------------+---------+----------------------+---------+------------+-----------+--+</span>
 
-<span class="err">#</span> <span class="nc">Snapshot</span> <span class="nc">Query</span>
+<span class="err">#</span> <span class="nc">Realtime</span> <span class="nc">View</span>
 <span class="n">scala</span><span class="o">&gt;</span> <span class="n">spark</span><span class="o">.</span><span class="na">sql</span><span class="o">(</span><span class="s">"select symbol, max(ts) from stock_ticks_mor_rt group by symbol HAVING symbol = 'GOOG'"</span><span class="o">).</span><span class="na">show</span><span class="o">(</span><span class="mi">100</span><span class="o">,</span> <span class="kc">false</span><span class="o">)</span>
 <span class="o">+---------+----------------------+--+</span>
 <span class="o">|</span> <span class="n">symbol</span>  <span class="o">|</span>         <span class="n">_c1</span>          <span class="o">|</span>
@@ -1389,14 +1379,14 @@ and the second commit time is 20180924070031</p>
 <span class="o">+----------------------+---------+----------------------+---------+------------+-----------+--+</span>
 </code></pre></div></div>
 
-<h3 id="step-11--presto-read-optimized-queries-on-mor-table-after-compaction">Step 11:  Presto Read Optimized queries on MOR table after compaction</h3>
+<h3 id="step-11--presto-queries-over-read-optimized-view-on-mor-dataset-after-compaction">Step 11:  Presto queries over Read Optimized View on MOR dataset after compaction</h3>
 
 <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">docker</span> <span class="n">exec</span> <span class="o">-</span><span class="n">it</span> <span class="n">presto</span><span class="o">-</span><span class="n">worker</span><span class="o">-</span><span class="mi">1</span> <span class="n">presto</span> <span class="o">--</span><span class="n">server</span> <span class="n">presto</span><span class="o">-</span><span class="n">c [...]
 <span class="n">presto</span><span class="o">&gt;</span> <span class="n">use</span> <span class="n">hive</span><span class="o">.</span><span class="na">default</span><span class="o">;</span>
 <span class="no">USE</span>
 
-<span class="err">#</span> <span class="nc">Read</span> <span class="nc">Optimized</span> <span class="nc">Query</span>
-<span class="nl">presto:</span><span class="k">default</span><span class="o">&gt;</span> <span class="n">select</span> <span class="n">symbol</span><span class="o">,</span> <span class="n">max</span><span class="o">(</span><span class="n">ts</span><span class="o">)</span> <span class="n">from</span> <span class="n">stock_ticks_mor_ro</span> <span class="n">group</span> <span class="n">by</span> <span class="n">symbol</span> <span class="no">HAVING</span> <span class="n">symbol</span> <spa [...]
+<span class="err">#</span> <span class="nc">Read</span> <span class="nc">Optimized</span> <span class="nc">View</span>
+<span class="nl">presto:</span><span class="k">default</span><span class="o">&gt;</span> <span class="n">select</span> <span class="n">symbol</span><span class="o">,</span> <span class="n">max</span><span class="o">(</span><span class="n">ts</span><span class="o">)</span> <span class="n">from</span> <span class="n">stock_ticks_mor</span> <span class="n">group</span> <span class="n">by</span> <span class="n">symbol</span> <span class="no">HAVING</span> <span class="n">symbol</span> <span c [...]
   <span class="n">symbol</span> <span class="o">|</span>        <span class="n">_col1</span>
 <span class="o">--------+---------------------</span>
  <span class="no">GOOG</span>   <span class="o">|</span> <span class="mi">2018</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">31</span> <span class="mi">10</span><span class="o">:</span><span class="mi">59</span><span class="o">:</span><span class="mo">00</span>
@@ -1406,7 +1396,7 @@ and the second commit time is 20180924070031</p>
 <span class="nl">Splits:</span> <span class="mi">49</span> <span class="n">total</span><span class="o">,</span> <span class="mi">49</span> <span class="n">done</span> <span class="o">(</span><span class="mf">100.00</span><span class="o">%)</span>
 <span class="mi">0</span><span class="o">:</span><span class="mo">01</span> <span class="o">[</span><span class="mi">197</span> <span class="n">rows</span><span class="o">,</span> <span class="mi">613</span><span class="no">B</span><span class="o">]</span> <span class="o">[</span><span class="mi">133</span> <span class="n">rows</span><span class="o">/</span><span class="n">s</span><span class="o">,</span> <span class="mi">414</span><span class="no">B</span><span class="o">/</span><span c [...]
 
-<span class="nl">presto:</span><span class="k">default</span><span class="o">&gt;</span> <span class="n">select</span> <span class="s">"_hoodie_commit_time"</span><span class="o">,</span> <span class="n">symbol</span><span class="o">,</span> <span class="n">ts</span><span class="o">,</span> <span class="n">volume</span><span class="o">,</span> <span class="n">open</span><span class="o">,</span> <span class="n">close</span>  <span class="n">from</span> <span class="n">stock_ticks_mor_ro</ [...]
+<span class="nl">presto:</span><span class="k">default</span><span class="o">&gt;</span> <span class="n">select</span> <span class="s">"_hoodie_commit_time"</span><span class="o">,</span> <span class="n">symbol</span><span class="o">,</span> <span class="n">ts</span><span class="o">,</span> <span class="n">volume</span><span class="o">,</span> <span class="n">open</span><span class="o">,</span> <span class="n">close</span>  <span class="n">from</span> <span class="n">stock_ticks_mor</spa [...]
  <span class="n">_hoodie_commit_time</span> <span class="o">|</span> <span class="n">symbol</span> <span class="o">|</span>         <span class="n">ts</span>          <span class="o">|</span> <span class="n">volume</span> <span class="o">|</span>   <span class="n">open</span>    <span class="o">|</span>  <span class="n">close</span>
 <span class="o">---------------------+--------+---------------------+--------+-----------+----------</span>
  <span class="mi">20190822180250</span>      <span class="o">|</span> <span class="no">GOOG</span>   <span class="o">|</span> <span class="mi">2018</span><span class="o">-</span><span class="mi">08</span><span class="o">-</span><span class="mi">31</span> <span class="mi">09</span><span class="o">:</span><span class="mi">59</span><span class="o">:</span><span class="mo">00</span> <span class="o">|</span>   <span class="mi">6330</span> <span class="o">|</span>    <span class="mf">1230.5</s [...]
@@ -1430,7 +1420,7 @@ and the second commit time is 20180924070031</p>
 </code></pre></div></div>
 <p>The above command builds docker images for all the services with
 current Hudi source installed at /var/hoodie/ws and also brings up the services using a compose file. We
-currently use Hadoop (v2.8.4), Hive (v2.3.3) and Spark (v2.4.4) in docker images.</p>
+currently use Hadoop (v2.8.4), Hive (v2.3.3) and Spark (v2.3.1) in docker images.</p>
 
 <p>To bring down the containers</p>
 <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err">$</span> <span class="n">cd</span> <span class="n">hudi</span><span class="o">-</span><span class="n">integ</span><span class="o">-</span><span class="n">test</span>
diff --git a/content/cn/docs/0.5.0-docs-versions.html b/content/cn/docs/0.5.1-docs-versions.html
similarity index 85%
copy from content/cn/docs/0.5.0-docs-versions.html
copy to content/cn/docs/0.5.1-docs-versions.html
index da9849f..bddc200 100644
--- a/content/cn/docs/0.5.0-docs-versions.html
+++ b/content/cn/docs/0.5.1-docs-versions.html
@@ -4,16 +4,16 @@
     <meta charset="utf-8">
 
 <!-- begin _includes/seo.html --><title>文档版本 - Apache Hudi</title>
-<meta name="description" content="                              Latest            英文版            中文版                                  0.5.0            英文版            中文版                  ">
+<meta name="description" content="                              Latest            英文版            中文版                                  0.5.1            英文版            中文版                                  0.5.0            英文版            中文版                  ">
 
 <meta property="og:type" content="article">
 <meta property="og:locale" content="en_US">
 <meta property="og:site_name" content="">
 <meta property="og:title" content="文档版本">
-<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.0-docs-versions.html">
+<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.1-docs-versions.html">
 
 
-  <meta property="og:description" content="                              Latest            英文版            中文版                                  0.5.0            英文版            中文版                  ">
+  <meta property="og:description" content="                              Latest            英文版            中文版                                  0.5.1            英文版            中文版                                  0.5.0            英文版            中文版                  ">
 
 
 
@@ -147,7 +147,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-quick-start-guide.html" class="">快速开始</a></li>
+              <li><a href="/cn/docs/0.5.1-quick-start-guide.html" class="">快速开始</a></li>
             
 
           
@@ -158,7 +158,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-use_cases.html" class="">使用案例</a></li>
+              <li><a href="/cn/docs/0.5.1-use_cases.html" class="">使用案例</a></li>
             
 
           
@@ -169,7 +169,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-powered_by.html" class="">演讲 & hudi 用户</a></li>
+              <li><a href="/cn/docs/0.5.1-powered_by.html" class="">演讲 & hudi 用户</a></li>
             
 
           
@@ -180,7 +180,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-comparison.html" class="">对比</a></li>
+              <li><a href="/cn/docs/0.5.1-comparison.html" class="">对比</a></li>
             
 
           
@@ -191,7 +191,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-docker_demo.html" class="">Docker 示例</a></li>
+              <li><a href="/cn/docs/0.5.1-docker_demo.html" class="">Docker 示例</a></li>
             
 
           
@@ -214,7 +214,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-concepts.html" class="">概念</a></li>
+              <li><a href="/cn/docs/0.5.1-concepts.html" class="">概念</a></li>
             
 
           
@@ -225,7 +225,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-writing_data.html" class="">写入数据</a></li>
+              <li><a href="/cn/docs/0.5.1-writing_data.html" class="">写入数据</a></li>
             
 
           
@@ -236,7 +236,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-querying_data.html" class="">查询数据</a></li>
+              <li><a href="/cn/docs/0.5.1-querying_data.html" class="">查询数据</a></li>
             
 
           
@@ -247,7 +247,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-configurations.html" class="">配置</a></li>
+              <li><a href="/cn/docs/0.5.1-configurations.html" class="">配置</a></li>
             
 
           
@@ -258,7 +258,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-performance.html" class="">性能</a></li>
+              <li><a href="/cn/docs/0.5.1-performance.html" class="">性能</a></li>
             
 
           
@@ -269,7 +269,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-admin_guide.html" class="">管理</a></li>
+              <li><a href="/cn/docs/0.5.1-deployment.html" class="">管理</a></li>
             
 
           
@@ -292,7 +292,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-docs-versions.html" class="active">文档版本</a></li>
+              <li><a href="/cn/docs/0.5.1-docs-versions.html" class="active">文档版本</a></li>
             
 
           
@@ -303,7 +303,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-privacy.html" class="">版权信息</a></li>
+              <li><a href="/cn/docs/0.5.1-privacy.html" class="">版权信息</a></li>
             
 
           
@@ -347,6 +347,12 @@
         </tr>
       
         <tr>
+            <th>0.5.1</th>
+            <td><a href="/docs/0.5.1-quick-start-guide.html">英文版</a></td>
+            <td><a href="/cn/docs/0.5.1-quick-start-guide.html">中文版</a></td>
+        </tr>
+      
+        <tr>
             <th>0.5.0</th>
             <td><a href="/docs/0.5.0-quick-start-guide.html">英文版</a></td>
             <td><a href="/cn/docs/0.5.0-quick-start-guide.html">中文版</a></td>
@@ -355,6 +361,7 @@
     </tbody>
 </table>
 
+
       </section>
 
       <a href="#masthead__inner-wrap" class="back-to-top">Back to top &uarr;</a>
diff --git a/content/cn/docs/0.5.0-docs-versions.html b/content/cn/docs/0.5.1-gcs_hoodie.html
similarity index 61%
copy from content/cn/docs/0.5.0-docs-versions.html
copy to content/cn/docs/0.5.1-gcs_hoodie.html
index da9849f..df552f0 100644
--- a/content/cn/docs/0.5.0-docs-versions.html
+++ b/content/cn/docs/0.5.1-gcs_hoodie.html
@@ -3,17 +3,17 @@
   <head>
     <meta charset="utf-8">
 
-<!-- begin _includes/seo.html --><title>文档版本 - Apache Hudi</title>
-<meta name="description" content="                              Latest            英文版            中文版                                  0.5.0            英文版            中文版                  ">
+<!-- begin _includes/seo.html --><title>GCS Filesystem - Apache Hudi</title>
+<meta name="description" content="For Hudi storage on GCS, regional buckets provide a DFS API with strong consistency.">
 
 <meta property="og:type" content="article">
 <meta property="og:locale" content="en_US">
 <meta property="og:site_name" content="">
-<meta property="og:title" content="文档版本">
-<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.0-docs-versions.html">
+<meta property="og:title" content="GCS Filesystem">
+<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.1-gcs_hoodie.html">
 
 
-  <meta property="og:description" content="                              Latest            英文版            中文版                                  0.5.0            英文版            中文版                  ">
+  <meta property="og:description" content="For Hudi storage on GCS, regional buckets provide a DFS API with strong consistency.">
 
 
 
@@ -147,7 +147,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-quick-start-guide.html" class="">快速开始</a></li>
+              <li><a href="/cn/docs/0.5.1-quick-start-guide.html" class="">快速开始</a></li>
             
 
           
@@ -158,7 +158,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-use_cases.html" class="">使用案例</a></li>
+              <li><a href="/cn/docs/0.5.1-use_cases.html" class="">使用案例</a></li>
             
 
           
@@ -169,7 +169,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-powered_by.html" class="">演讲 & hudi 用户</a></li>
+              <li><a href="/cn/docs/0.5.1-powered_by.html" class="">演讲 & hudi 用户</a></li>
             
 
           
@@ -180,7 +180,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-comparison.html" class="">对比</a></li>
+              <li><a href="/cn/docs/0.5.1-comparison.html" class="">对比</a></li>
             
 
           
@@ -191,7 +191,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-docker_demo.html" class="">Docker 示例</a></li>
+              <li><a href="/cn/docs/0.5.1-docker_demo.html" class="">Docker 示例</a></li>
             
 
           
@@ -214,7 +214,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-concepts.html" class="">概念</a></li>
+              <li><a href="/cn/docs/0.5.1-concepts.html" class="">概念</a></li>
             
 
           
@@ -225,7 +225,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-writing_data.html" class="">写入数据</a></li>
+              <li><a href="/cn/docs/0.5.1-writing_data.html" class="">写入数据</a></li>
             
 
           
@@ -236,7 +236,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-querying_data.html" class="">查询数据</a></li>
+              <li><a href="/cn/docs/0.5.1-querying_data.html" class="">查询数据</a></li>
             
 
           
@@ -247,7 +247,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-configurations.html" class="">配置</a></li>
+              <li><a href="/cn/docs/0.5.1-configurations.html" class="">配置</a></li>
             
 
           
@@ -258,7 +258,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-performance.html" class="">性能</a></li>
+              <li><a href="/cn/docs/0.5.1-performance.html" class="">性能</a></li>
             
 
           
@@ -269,7 +269,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-admin_guide.html" class="">管理</a></li>
+              <li><a href="/cn/docs/0.5.1-deployment.html" class="">管理</a></li>
             
 
           
@@ -292,7 +292,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-docs-versions.html" class="active">文档版本</a></li>
+              <li><a href="/cn/docs/0.5.1-docs-versions.html" class="">文档版本</a></li>
             
 
           
@@ -303,7 +303,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-privacy.html" class="">版权信息</a></li>
+              <li><a href="/cn/docs/0.5.1-privacy.html" class="">版权信息</a></li>
             
 
           
@@ -324,7 +324,7 @@
     <div class="page__inner-wrap">
       
         <header>
-          <h1 id="page-title" class="page__title" itemprop="headline">文档版本
+          <h1 id="page-title" class="page__title" itemprop="headline">GCS Filesystem
 </h1>
         </header>
       
@@ -337,23 +337,63 @@
             }
           </style>
         
-        <table class="docversions">
-    <tbody>
-      
-        <tr>
-            <th>Latest</th>
-            <td><a href="/docs/quick-start-guide.html">英文版</a></td>
-            <td><a href="/cn/docs/quick-start-guide.html">中文版</a></td>
-        </tr>
-      
-        <tr>
-            <th>0.5.0</th>
-            <td><a href="/docs/0.5.0-quick-start-guide.html">英文版</a></td>
-            <td><a href="/cn/docs/0.5.0-quick-start-guide.html">中文版</a></td>
-        </tr>
-      
-    </tbody>
-</table>
+        <p>For Hudi storage on GCS, <strong>regional</strong> buckets provide a DFS API with strong consistency.</p>
+
+<h2 id="gcs-configs">GCS Configs</h2>
+
+<p>There are two configurations required for Hudi GCS compatibility:</p>
+
+<ul>
+  <li>Adding GCS Credentials for Hudi</li>
+  <li>Adding required jars to classpath</li>
+</ul>
+
+<h3 id="gcs-credentials">GCS Credentials</h3>
+
+<p>Add the required configs to your core-site.xml, from where Hudi can fetch them. Set <code class="highlighter-rouge">fs.defaultFS</code> to your GCS bucket name, and Hudi should be able to read from and write to the bucket.</p>
+
+<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="nt">&lt;property&gt;</span>
+    <span class="nt">&lt;name&gt;</span>fs.defaultFS<span class="nt">&lt;/name&gt;</span>
+    <span class="nt">&lt;value&gt;</span>gs://hudi-bucket<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+
+  <span class="nt">&lt;property&gt;</span>
+    <span class="nt">&lt;name&gt;</span>fs.gs.impl<span class="nt">&lt;/name&gt;</span>
+    <span class="nt">&lt;value&gt;</span>com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem<span class="nt">&lt;/value&gt;</span>
+    <span class="nt">&lt;description&gt;</span>The FileSystem for gs: (GCS) uris.<span class="nt">&lt;/description&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+
+  <span class="nt">&lt;property&gt;</span>
+    <span class="nt">&lt;name&gt;</span>fs.AbstractFileSystem.gs.impl<span class="nt">&lt;/name&gt;</span>
+    <span class="nt">&lt;value&gt;</span>com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS<span class="nt">&lt;/value&gt;</span>
+    <span class="nt">&lt;description&gt;</span>The AbstractFileSystem for gs: (GCS) uris.<span class="nt">&lt;/description&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+
+  <span class="nt">&lt;property&gt;</span>
+    <span class="nt">&lt;name&gt;</span>fs.gs.project.id<span class="nt">&lt;/name&gt;</span>
+    <span class="nt">&lt;value&gt;</span>GCS_PROJECT_ID<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+  <span class="nt">&lt;property&gt;</span>
+    <span class="nt">&lt;name&gt;</span>google.cloud.auth.service.account.enable<span class="nt">&lt;/name&gt;</span>
+    <span class="nt">&lt;value&gt;</span>true<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+  <span class="nt">&lt;property&gt;</span>
+    <span class="nt">&lt;name&gt;</span>google.cloud.auth.service.account.email<span class="nt">&lt;/name&gt;</span>
+    <span class="nt">&lt;value&gt;</span>GCS_SERVICE_ACCOUNT_EMAIL<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+  <span class="nt">&lt;property&gt;</span>
+    <span class="nt">&lt;name&gt;</span>google.cloud.auth.service.account.keyfile<span class="nt">&lt;/name&gt;</span>
+    <span class="nt">&lt;value&gt;</span>GCS_SERVICE_ACCOUNT_KEYFILE<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+</code></pre></div></div>
+
+<h3 id="gcs-libs">GCS Libs</h3>
+
+<p>GCS Hadoop libraries to add to the classpath:</p>
+
+<ul>
+  <li>com.google.cloud.bigdataoss:gcs-connector:1.6.0-hadoop2</li>
+</ul>
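As a cross-check, the properties listed above can be rendered programmatically into a Hadoop-style `<configuration>` block. This is only a sketch: the bucket name is the placeholder from the snippet above, and the `GCS_PROJECT_ID`/service-account values are stand-ins you would replace with real credentials.

```python
import xml.etree.ElementTree as ET

# Property names taken from the core-site.xml snippet above;
# the values are placeholders, not real resources or credentials.
props = {
    "fs.defaultFS": "gs://hudi-bucket",
    "fs.gs.impl": "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem",
    "fs.AbstractFileSystem.gs.impl": "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS",
    "fs.gs.project.id": "GCS_PROJECT_ID",
    "google.cloud.auth.service.account.enable": "true",
    "google.cloud.auth.service.account.email": "GCS_SERVICE_ACCOUNT_EMAIL",
    "google.cloud.auth.service.account.keyfile": "GCS_SERVICE_ACCOUNT_KEYFILE",
}

# Render the properties as a Hadoop <configuration> document.
configuration = ET.Element("configuration")
for name, value in props.items():
    prop = ET.SubElement(configuration, "property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value

core_site_xml = ET.tostring(configuration, encoding="unicode")
print(core_site_xml)
```

Dropping the emitted `<property>` elements into your existing core-site.xml is equivalent to the hand-written snippet above.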
 
       </section>
 
diff --git a/content/cn/docs/0.5.1-migration_guide.html b/content/cn/docs/0.5.1-migration_guide.html
new file mode 100644
index 0000000..be57245
--- /dev/null
+++ b/content/cn/docs/0.5.1-migration_guide.html
@@ -0,0 +1,444 @@
+<!doctype html>
+<html lang="en" class="no-js">
+  <head>
+    <meta charset="utf-8">
+
+<!-- begin _includes/seo.html --><title>Migration Guide - Apache Hudi</title>
+<meta name="description" content="Hudi maintains metadata such as commit timeline and indexes to manage a dataset. The commit timelines helps to understand the actions happening on a dataset as well as the current state of a dataset. Indexes are used by Hudi to maintain a record key to file id mapping to efficiently locate a record. At the moment, Hudi supports writing only parquet columnar formats.To be able to start using Hudi for your existing dataset, you will need to migrate your ex [...]
+
+<meta property="og:type" content="article">
+<meta property="og:locale" content="en_US">
+<meta property="og:site_name" content="">
+<meta property="og:title" content="Migration Guide">
+<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.1-migration_guide.html">
+
+
+  <meta property="og:description" content="Hudi maintains metadata such as commit timeline and indexes to manage a dataset. The commit timelines helps to understand the actions happening on a dataset as well as the current state of a dataset. Indexes are used by Hudi to maintain a record key to file id mapping to efficiently locate a record. At the moment, Hudi supports writing only parquet columnar formats.To be able to start using Hudi for your existing dataset, you will need to migrat [...]
+
+
+
+
+
+  <meta property="article:modified_time" content="2019-12-30T14:59:57-05:00">
+
+
+
+
+
+
+
+<!-- end _includes/seo.html -->
+
+
+<!--<link href="/feed.xml" type="application/atom+xml" rel="alternate" title=" Feed">-->
+
+<!-- https://t.co/dKP3o1e -->
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+<script>
+  document.documentElement.className = document.documentElement.className.replace(/\bno-js\b/g, '') + ' js ';
+</script>
+
+<!-- For all browsers -->
+<link rel="stylesheet" href="/assets/css/main.css">
+
+<!--[if IE]>
+  <style>
+    /* old IE unsupported flexbox fixes */
+    .greedy-nav .site-title {
+      padding-right: 3em;
+    }
+    .greedy-nav button {
+      position: absolute;
+      top: 0;
+      right: 0;
+      height: 100%;
+    }
+  </style>
+<![endif]-->
+
+
+
+<link rel="icon" type="image/x-icon" href="/assets/images/favicon.ico">
+<link rel="stylesheet" href="/assets/css/font-awesome.min.css">
+
+  </head>
+
+  <body class="layout--single">
+    <!--[if lt IE 9]>
+<div class="notice--danger align-center" style="margin: 0;">You are using an <strong>outdated</strong> browser. Please <a href="https://browsehappy.com/">upgrade your browser</a> to improve your experience.</div>
+<![endif]-->
+
+    <div class="masthead">
+  <div class="masthead__inner-wrap" id="masthead__inner-wrap">
+    <div class="masthead__menu">
+      <nav id="site-nav" class="greedy-nav">
+        
+          <a class="site-logo" href="/">
+              <div style="width: 150px; height: 40px">
+              </div>
+          </a>
+        
+        <a class="site-title" href="/">
+          
+        </a>
+        <ul class="visible-links"><li class="masthead__menu-item">
+              <a href="/cn/docs/quick-start-guide.html" target="_self" >文档</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/community.html" target="_self" >社区</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/activity.html" target="_self" >动态</a>
+            </li><li class="masthead__menu-item">
+              <a href="https://cwiki.apache.org/confluence/display/HUDI/FAQ" target="_blank" >FAQ</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/releases.html" target="_self" >发布</a>
+            </li></ul>
+        <button class="greedy-nav__toggle hidden" type="button">
+          <span class="visually-hidden">Toggle menu</span>
+          <div class="navicon"></div>
+        </button>
+        <ul class="hidden-links hidden"></ul>
+      </nav>
+    </div>
+  </div>
+</div>
+<!--
+<p class="notice--warning" style="margin: 0 !important; text-align: center !important;"><strong>Note:</strong> This site is work in progress, if you notice any issues, please <a target="_blank" href="https://github.com/apache/incubator-hudi/issues">Report on Issue</a>.
+  Click <a href="/"> here</a> back to old site.</p>
+-->
+
+    <div class="initial-content">
+      <div id="main" role="main">
+  
+
+  <div class="sidebar sticky">
+
+  
+
+  
+
+    
+      
+
+
+
+
+
+
+
+<nav class="nav__list">
+  
+  <input id="ac-toc" name="accordion-toc" type="checkbox" />
+  <label for="ac-toc">文档菜单</label>
+  <ul class="nav__items">
+    
+      <li>
+        
+          <span class="nav__sub-title">入门指南</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-quick-start-guide.html" class="">快速开始</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-use_cases.html" class="">使用案例</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-powered_by.html" class="">演讲 & hudi 用户</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-comparison.html" class="">对比</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-docker_demo.html" class="">Docker 示例</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">帮助文档</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-concepts.html" class="">概念</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-writing_data.html" class="">写入数据</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-querying_data.html" class="">查询数据</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-configurations.html" class="">配置</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-performance.html" class="">性能</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-deployment.html" class="">管理</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">其他信息</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-docs-versions.html" class="">文档版本</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-privacy.html" class="">版权信息</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+  </ul>
+</nav>
+    
+
+  
+  </div>
+
+
+  <article class="page" itemscope itemtype="https://schema.org/CreativeWork">
+
+    <div class="page__inner-wrap">
+      
+        <header>
+          <h1 id="page-title" class="page__title" itemprop="headline">Migration Guide
+</h1>
+        </header>
+      
+
+      <section class="page__content" itemprop="text">
+        
+          <style>
+            .page {
+              padding-right: 0 !important;
+            }
+          </style>
+        
+        <p>Hudi maintains metadata such as the commit timeline and indexes to manage a dataset. The commit timeline helps to understand the actions happening on a dataset as well as its current state, while indexes maintain a record-key-to-file-id mapping that lets Hudi efficiently locate a record. At the moment, Hudi supports writing only the parquet columnar format.
+To start using Hudi for an existing dataset, you will need to migrate that dataset into a Hudi managed dataset. There are a couple of ways to achieve this.</p>
+
+<h2 id="approaches">Approaches</h2>
+
+<h3 id="use-hudi-for-new-partitions-alone">Use Hudi for new partitions alone</h3>
+
+<p>Hudi can manage an existing dataset without affecting or altering the historical data already present in it.
+Hudi supports such a mixed dataset with the caveat that each Hive partition is either completely Hudi managed or
+not managed at all; thus the lowest granularity at which Hudi manages a dataset is a Hive
+partition. Start using the datasource API or the WriteClient to write to the dataset, and make sure you either write
+to a new partition or convert only your last N partitions into Hudi rather than the entire table. Note that since the historical
+ partitions are not managed by Hudi, none of the primitives provided by Hudi work on the data in those partitions; more concretely, one cannot perform upserts or incremental pulls on such older, unmanaged partitions.
+Take this approach if your dataset is append-only and you do not expect to perform any updates to existing (non Hudi managed) partitions.</p>
+
+<h3 id="convert-existing-dataset-to-hudi">Convert existing dataset to Hudi</h3>
+
+<p>Import your existing dataset into a Hudi managed dataset. Since all the data is Hudi managed, none of the limitations
+ of Approach 1 apply here. Updates spanning any partitions can be applied to this dataset, and Hudi will efficiently
+ make the updates available to queries. Beyond being able to use all Hudi primitives on this dataset, there are
+ additional advantages: Hudi automatically manages the file sizes of a Hudi managed dataset. You can define the
+ desired file size when converting the dataset, and Hudi will ensure it writes out files adhering to that config.
+ It will also ensure that smaller files later get corrected by routing some new inserts into them rather than
+ writing new small files, thus maintaining the health of your cluster.</p>
+
+<p>There are a few options when choosing this approach.</p>
+
+<p><strong>Option 1</strong>
+Use the HDFSParquetImporter tool. As the name suggests, this only works if your existing dataset is in the parquet file format.
+This tool essentially starts a Spark job that reads the existing parquet dataset and converts it into a Hudi managed dataset by re-writing all the data.</p>
+
+<p><strong>Option 2</strong>
+For huge datasets, this could be as simple as:</p>
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span> <span class="n">partition</span> <span class="n">in</span> <span class="o">[</span><span class="n">list</span> <span class="n">of</span> <span class="n">partitions</span> <span class="n">in</span> <span class="n">source</span> <span class="n">dataset</span><span class="o">]</span> <span class="o">{</span>
+        <span class="n">val</span> <span class="n">inputDF</span> <span class="o">=</span> <span class="n">spark</span><span class="o">.</span><span class="na">read</span><span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"any_input_format"</span><span class="o">).</span><span class="na">load</span><span class="o">(</span><span class="s">"partition_path"</span><span class="o">)</span>
+        <span class="n">inputDF</span><span class="o">.</span><span class="na">write</span><span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"org.apache.hudi"</span><span class="o">).</span><span class="na">option</span><span class="o">()....</span><span class="na">save</span><span class="o">(</span><span class="s">"basePath"</span><span class="o">)</span>
+<span class="o">}</span>
+</code></pre></div></div>
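The loop above is pseudocode for the partition-by-partition conversion; the control flow can be sketched as follows. Everything here is illustrative: `read_partition` and `write_as_hudi` are hypothetical stand-ins for the real `spark.read.format(...).load(...)` and `df.write.format("org.apache.hudi")...save(...)` calls, and the partition names and base path are made up.

```python
# Stand-in for spark.read.format("any_input_format").load(partition_path):
# returns a placeholder "DataFrame" for the given partition.
def read_partition(path):
    return {"path": path, "rows": 100}

# Stand-in for inputDF.write.format("org.apache.hudi").option()....save(basePath):
# records where the partition's data would be written.
def write_as_hudi(df, base_path):
    return (df["path"], base_path)

partitions = ["2018/08/31", "2018/09/01", "2018/09/02"]  # example source layout
base_path = "/tmp/hudi/stock_ticks"                      # example Hudi base path

# Convert one source partition at a time into the Hudi managed dataset.
written = []
for partition in partitions:
    df = read_partition(partition)
    written.append(write_as_hudi(df, base_path))

print(len(written))
```

Each iteration reads one source partition and re-writes it through the Hudi datasource, so the conversion can be resumed or parallelized per partition.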
+
+<p><strong>Option 3</strong>
+Write your own custom logic to load an existing dataset into a Hudi managed one. Please read about the RDD API
+ <a href="/cn/docs/0.5.1-quick-start-guide.html">here</a>. To use the HDFSParquetImporter tool: once Hudi has been built via <code class="highlighter-rouge">mvn clean install -DskipTests</code>, the shell can be
+fired via <code class="highlighter-rouge">cd hudi-cli &amp;&amp; ./hudi-cli.sh</code>.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">hudi</span><span class="o">-&gt;</span><span class="n">hdfsparquetimport</span>
+        <span class="o">--</span><span class="n">upsert</span> <span class="kc">false</span>
+        <span class="o">--</span><span class="n">srcPath</span> <span class="o">/</span><span class="n">user</span><span class="o">/</span><span class="n">parquet</span><span class="o">/</span><span class="n">dataset</span><span class="o">/</span><span class="n">basepath</span>
+        <span class="o">--</span><span class="n">targetPath</span>
+        <span class="o">/</span><span class="n">user</span><span class="o">/</span><span class="n">hoodie</span><span class="o">/</span><span class="n">dataset</span><span class="o">/</span><span class="n">basepath</span>
+        <span class="o">--</span><span class="n">tableName</span> <span class="n">hoodie_table</span>
+        <span class="o">--</span><span class="n">tableType</span> <span class="no">COPY_ON_WRITE</span>
+        <span class="o">--</span><span class="n">rowKeyField</span> <span class="n">_row_key</span>
+        <span class="o">--</span><span class="n">partitionPathField</span> <span class="n">partitionStr</span>
+        <span class="o">--</span><span class="n">parallelism</span> <span class="mi">1500</span>
+        <span class="o">--</span><span class="n">schemaFilePath</span> <span class="o">/</span><span class="n">user</span><span class="o">/</span><span class="n">table</span><span class="o">/</span><span class="n">schema</span>
+        <span class="o">--</span><span class="n">format</span> <span class="n">parquet</span>
+        <span class="o">--</span><span class="n">sparkMemory</span> <span class="mi">6</span><span class="n">g</span>
+        <span class="o">--</span><span class="n">retry</span> <span class="mi">2</span>
+</code></pre></div></div>
+
+      </section>
+
+      <a href="#masthead__inner-wrap" class="back-to-top">Back to top &uarr;</a>
+
+
+      
+
+    </div>
+
+  </article>
+
+</div>
+
+    </div>
+
+    <div class="page__footer">
+      <footer>
+        
+<div class="row">
+  <div class="col-lg-12 footer">
+    <p>
+      <a class="footer-link-img" href="https://apache.org">
+        <img width="250px" src="/assets/images/asf_logo.svg" alt="The Apache Software Foundation">
+      </a>
+    </p>
+    <p>
+      Copyright &copy; <span id="copyright-year">2019</span> <a href="https://apache.org">The Apache Software Foundation</a>, Licensed under the Apache License, Version 2.0.
+      Hudi, Apache and the Apache feather logo are trademarks of The Apache Software Foundation. <a href="/docs/privacy">Privacy Policy</a>
+      <br>
+      Apache Hudi is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the <a href="http://incubator.apache.org/">Apache Incubator</a>.
+      Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have
+      stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a
+      reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
+    </p>
+  </div>
+</div>
+      </footer>
+    </div>
+
+    
+<script src="/assets/js/main.min.js"></script>
+
+
+  </body>
+</html>
\ No newline at end of file
diff --git a/content/cn/docs/0.5.0-docs-versions.html b/content/cn/docs/0.5.1-performance.html
similarity index 60%
copy from content/cn/docs/0.5.0-docs-versions.html
copy to content/cn/docs/0.5.1-performance.html
index da9849f..370ef6b 100644
--- a/content/cn/docs/0.5.0-docs-versions.html
+++ b/content/cn/docs/0.5.1-performance.html
@@ -3,17 +3,17 @@
   <head>
     <meta charset="utf-8">
 
-<!-- begin _includes/seo.html --><title>文档版本 - Apache Hudi</title>
-<meta name="description" content="                              Latest            英文版            中文版                                  0.5.0            英文版            中文版                  ">
+<!-- begin _includes/seo.html --><title>性能 - Apache Hudi</title>
+<meta name="description" content="在本节中,我们将介绍一些有关Hudi插入更新、增量提取的实际性能数据,并将其与实现这些任务的其它传统工具进行比较。">
 
 <meta property="og:type" content="article">
 <meta property="og:locale" content="en_US">
 <meta property="og:site_name" content="">
-<meta property="og:title" content="文档版本">
-<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.0-docs-versions.html">
+<meta property="og:title" content="性能">
+<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.1-performance.html">
 
 
-  <meta property="og:description" content="                              Latest            英文版            中文版                                  0.5.0            英文版            中文版                  ">
+  <meta property="og:description" content="在本节中,我们将介绍一些有关Hudi插入更新、增量提取的实际性能数据,并将其与实现这些任务的其它传统工具进行比较。">
 
 
 
@@ -147,7 +147,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-quick-start-guide.html" class="">快速开始</a></li>
+              <li><a href="/cn/docs/0.5.1-quick-start-guide.html" class="">快速开始</a></li>
             
 
           
@@ -158,7 +158,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-use_cases.html" class="">使用案例</a></li>
+              <li><a href="/cn/docs/0.5.1-use_cases.html" class="">使用案例</a></li>
             
 
           
@@ -169,7 +169,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-powered_by.html" class="">演讲 & hudi 用户</a></li>
+              <li><a href="/cn/docs/0.5.1-powered_by.html" class="">演讲 & hudi 用户</a></li>
             
 
           
@@ -180,7 +180,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-comparison.html" class="">对比</a></li>
+              <li><a href="/cn/docs/0.5.1-comparison.html" class="">对比</a></li>
             
 
           
@@ -191,7 +191,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-docker_demo.html" class="">Docker 示例</a></li>
+              <li><a href="/cn/docs/0.5.1-docker_demo.html" class="">Docker 示例</a></li>
             
 
           
@@ -214,7 +214,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-concepts.html" class="">概念</a></li>
+              <li><a href="/cn/docs/0.5.1-concepts.html" class="">概念</a></li>
             
 
           
@@ -225,7 +225,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-writing_data.html" class="">写入数据</a></li>
+              <li><a href="/cn/docs/0.5.1-writing_data.html" class="">写入数据</a></li>
             
 
           
@@ -236,7 +236,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-querying_data.html" class="">查询数据</a></li>
+              <li><a href="/cn/docs/0.5.1-querying_data.html" class="">查询数据</a></li>
             
 
           
@@ -247,7 +247,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-configurations.html" class="">配置</a></li>
+              <li><a href="/cn/docs/0.5.1-configurations.html" class="">配置</a></li>
             
 
           
@@ -258,7 +258,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-performance.html" class="">性能</a></li>
+              <li><a href="/cn/docs/0.5.1-performance.html" class="active">性能</a></li>
             
 
           
@@ -269,7 +269,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-admin_guide.html" class="">管理</a></li>
+              <li><a href="/cn/docs/0.5.1-deployment.html" class="">管理</a></li>
             
 
           
@@ -292,7 +292,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-docs-versions.html" class="active">文档版本</a></li>
+              <li><a href="/cn/docs/0.5.1-docs-versions.html" class="">文档版本</a></li>
             
 
           
@@ -303,7 +303,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-privacy.html" class="">版权信息</a></li>
+              <li><a href="/cn/docs/0.5.1-privacy.html" class="">版权信息</a></li>
             
 
           
@@ -324,7 +324,7 @@
     <div class="page__inner-wrap">
       
         <header>
-          <h1 id="page-title" class="page__title" itemprop="headline">文档版本
+          <h1 id="page-title" class="page__title" itemprop="headline">性能
 </h1>
         </header>
       
@@ -337,23 +337,60 @@
             }
           </style>
         
-        <table class="docversions">
-    <tbody>
-      
-        <tr>
-            <th>Latest</th>
-            <td><a href="/docs/quick-start-guide.html">英文版</a></td>
-            <td><a href="/cn/docs/quick-start-guide.html">中文版</a></td>
-        </tr>
-      
-        <tr>
-            <th>0.5.0</th>
-            <td><a href="/docs/0.5.0-quick-start-guide.html">英文版</a></td>
-            <td><a href="/cn/docs/0.5.0-quick-start-guide.html">中文版</a></td>
-        </tr>
-      
-    </tbody>
-</table>
+        <p>在本节中,我们将介绍一些有关Hudi插入更新、增量提取的实际性能数据,并将其与实现这些任务的其它传统工具进行比较。</p>
+
+<h2 id="插入更新">插入更新</h2>
+
+<p>下面显示了从NoSQL数据库摄取获得的速度提升,这些速度提升数据是通过在写入时复制存储上的Hudi数据集上插入更新而获得的,
+数据集包括5个从小到大的表(相对于批量加载表)。</p>
+
+<figure>
+    <img class="docimage" src="/assets/images/hudi_upsert_perf1.png" alt="hudi_upsert_perf1.png" style="max-width: 1000px" />
+</figure>
+
+<p>由于Hudi可以通过增量构建数据集,它也为更频繁地调度摄取提供了可能性,从而减少了延迟,并显著节省了总体计算成本。</p>
+
+<figure>
+    <img class="docimage" src="/assets/images/hudi_upsert_perf2.png" alt="hudi_upsert_perf2.png" style="max-width: 1000px" />
+</figure>
+
+<p>Hudi插入更新在t1表的一次提交中就进行了高达4TB的压力测试。
+有关一些调优技巧,请参见<a href="https://cwiki.apache.org/confluence/display/HUDI/Tuning+Guide">这里</a>。</p>
+
+<h2 id="索引">索引</h2>
+
+<p>为了有效地插入更新数据,Hudi需要将要写入的批量数据中的记录分类为插入和更新(并标记它所属的文件组)。
+为了加快此操作的速度,Hudi采用了可插拔索引机制,该机制存储了recordKey和它所属的文件组ID之间的映射。
+默认情况下,Hudi使用内置索引,该索引使用文件范围和布隆过滤器来完成此任务,相比于Spark Join,其速度最高可提高10倍。</p>
+
+<p>当您将recordKey建模为单调递增时(例如时间戳前缀),Hudi提供了最佳的索引性能,从而进行范围过滤来避免与许多文件进行比较。
+即使对于基于UUID的键,也有<a href="https://www.percona.com/blog/2014/12/19/store-uuid-optimized-way/">已知技术</a>来达到同样目的。
+例如,在具有80B键、3个分区、11416个文件、10TB数据的事件表上使用100M个时间戳前缀的键(5%的更新,95%的插入)时,
+相比于原始Spark Join,Hudi索引速度的提升<strong>约为7倍(440秒相比于2880秒)</strong>。
+即使对于具有挑战性的工作负载,如使用300个核对3.25B UUID键、30个分区、6180个文件的“100%更新”的数据库摄取工作负载,Hudi索引也可以提供<strong>80-100%的加速</strong>。</p>
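The file-range pruning described above can be illustrated with a toy sketch. Each file tracks the min/max record keys it contains, so a monotonically increasing (e.g. timestamp-prefixed) key only needs to be checked against files whose range can contain it. The field names here are illustrative, not Hudi's actual file metadata layout:

```python
# Toy illustration of range-based file pruning: with timestamp-prefixed
# keys, most files are eliminated by a simple min/max comparison before
# any bloom-filter or record-level lookup is needed.

files = [
    {"name": "f1", "min_key": "20200101-0000", "max_key": "20200102-9999"},
    {"name": "f2", "min_key": "20200103-0000", "max_key": "20200104-9999"},
    {"name": "f3", "min_key": "20200105-0000", "max_key": "20200106-9999"},
]

def candidate_files(key, files):
    """Return names of files whose [min_key, max_key] range can contain key."""
    return [f["name"] for f in files if f["min_key"] <= key <= f["max_key"]]
```

With random keys (e.g. raw UUIDs) every file's range tends to span the whole key space and this pruning step eliminates nothing, which is why the text recommends key designs with an ordered prefix.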
+
+<h2 id="读优化查询">读优化查询</h2>
+
+<p>读优化视图的主要设计目标是在不影响查询的情况下实现上一节中提到的延迟减少和效率提高。
+下图比较了对Hudi和非Hudi数据集的Hive、Presto、Spark查询,并对此进行说明。</p>
+
+<p><strong>Hive</strong></p>
+
+<figure>
+    <img class="docimage" src="/assets/images/hudi_query_perf_hive.png" alt="hudi_query_perf_hive.png" style="max-width: 800px" />
+</figure>
+
+<p><strong>Spark</strong></p>
+
+<figure>
+    <img class="docimage" src="/assets/images/hudi_query_perf_spark.png" alt="hudi_query_perf_spark.png" style="max-width: 1000px" />
+</figure>
+
+<p><strong>Presto</strong></p>
+
+<figure>
+    <img class="docimage" src="/assets/images/hudi_query_perf_presto.png" alt="hudi_query_perf_presto.png" style="max-width: 1000px" />
+</figure>
 
       </section>
 
diff --git a/content/cn/docs/0.5.0-docs-versions.html b/content/cn/docs/0.5.1-powered_by.html
similarity index 55%
copy from content/cn/docs/0.5.0-docs-versions.html
copy to content/cn/docs/0.5.1-powered_by.html
index da9849f..9068bbd 100644
--- a/content/cn/docs/0.5.0-docs-versions.html
+++ b/content/cn/docs/0.5.1-powered_by.html
@@ -3,23 +3,23 @@
   <head>
     <meta charset="utf-8">
 
-<!-- begin _includes/seo.html --><title>文档版本 - Apache Hudi</title>
-<meta name="description" content="                              Latest            英文版            中文版                                  0.5.0            英文版            中文版                  ">
+<!-- begin _includes/seo.html --><title>演讲 &amp; Hudi 用户 - Apache Hudi</title>
+<meta name="description" content="已使用">
 
 <meta property="og:type" content="article">
 <meta property="og:locale" content="en_US">
 <meta property="og:site_name" content="">
-<meta property="og:title" content="文档版本">
-<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.0-docs-versions.html">
+<meta property="og:title" content="演讲 &amp; Hudi 用户">
+<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.1-powered_by.html">
 
 
-  <meta property="og:description" content="                              Latest            英文版            中文版                                  0.5.0            英文版            中文版                  ">
+  <meta property="og:description" content="已使用">
 
 
 
 
 
-  <meta property="article:modified_time" content="2019-12-30T14:59:57-05:00">
+  <meta property="article:modified_time" content="2019-12-31T14:59:57-05:00">
 
 
 
@@ -147,7 +147,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-quick-start-guide.html" class="">快速开始</a></li>
+              <li><a href="/cn/docs/0.5.1-quick-start-guide.html" class="">快速开始</a></li>
             
 
           
@@ -158,7 +158,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-use_cases.html" class="">使用案例</a></li>
+              <li><a href="/cn/docs/0.5.1-use_cases.html" class="">使用案例</a></li>
             
 
           
@@ -169,7 +169,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-powered_by.html" class="">演讲 & hudi 用户</a></li>
+              <li><a href="/cn/docs/0.5.1-powered_by.html" class="active">演讲 & hudi 用户</a></li>
             
 
           
@@ -180,7 +180,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-comparison.html" class="">对比</a></li>
+              <li><a href="/cn/docs/0.5.1-comparison.html" class="">对比</a></li>
             
 
           
@@ -191,7 +191,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-docker_demo.html" class="">Docker 示例</a></li>
+              <li><a href="/cn/docs/0.5.1-docker_demo.html" class="">Docker 示例</a></li>
             
 
           
@@ -214,7 +214,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-concepts.html" class="">概念</a></li>
+              <li><a href="/cn/docs/0.5.1-concepts.html" class="">概念</a></li>
             
 
           
@@ -225,7 +225,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-writing_data.html" class="">写入数据</a></li>
+              <li><a href="/cn/docs/0.5.1-writing_data.html" class="">写入数据</a></li>
             
 
           
@@ -236,7 +236,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-querying_data.html" class="">查询数据</a></li>
+              <li><a href="/cn/docs/0.5.1-querying_data.html" class="">查询数据</a></li>
             
 
           
@@ -247,7 +247,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-configurations.html" class="">配置</a></li>
+              <li><a href="/cn/docs/0.5.1-configurations.html" class="">配置</a></li>
             
 
           
@@ -258,7 +258,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-performance.html" class="">性能</a></li>
+              <li><a href="/cn/docs/0.5.1-performance.html" class="">性能</a></li>
             
 
           
@@ -269,7 +269,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-admin_guide.html" class="">管理</a></li>
+              <li><a href="/cn/docs/0.5.1-deployment.html" class="">管理</a></li>
             
 
           
@@ -292,7 +292,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-docs-versions.html" class="active">文档版本</a></li>
+              <li><a href="/cn/docs/0.5.1-docs-versions.html" class="">文档版本</a></li>
             
 
           
@@ -303,7 +303,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-privacy.html" class="">版权信息</a></li>
+              <li><a href="/cn/docs/0.5.1-privacy.html" class="">版权信息</a></li>
             
 
           
@@ -324,7 +324,7 @@
     <div class="page__inner-wrap">
       
         <header>
-          <h1 id="page-title" class="page__title" itemprop="headline">文档版本
+          <h1 id="page-title" class="page__title" itemprop="headline">演讲 &amp; Hudi 用户
 </h1>
         </header>
       
@@ -337,23 +337,68 @@
             }
           </style>
         
-        <table class="docversions">
-    <tbody>
-      
-        <tr>
-            <th>Latest</th>
-            <td><a href="/docs/quick-start-guide.html">英文版</a></td>
-            <td><a href="/cn/docs/quick-start-guide.html">中文版</a></td>
-        </tr>
-      
-        <tr>
-            <th>0.5.0</th>
-            <td><a href="/docs/0.5.0-quick-start-guide.html">英文版</a></td>
-            <td><a href="/cn/docs/0.5.0-quick-start-guide.html">中文版</a></td>
-        </tr>
-      
-    </tbody>
-</table>
+        <h2 id="已使用">已使用</h2>
+
+<h3 id="uber">Uber</h3>
+
+<p>Hudi最初由<a href="https://uber.com">Uber</a>开发,用于实现<a href="http://www.slideshare.net/vinothchandar/hadoop-strata-talk-uber-your-hadoop-has-arrived/32">低延迟、高效率的数据库摄取</a>。
+Hudi自2016年8月开始在生产环境上线,在Hadoop上驱动约100个非常关键的业务表,支撑约几百TB的数据规模(前10名包括行程、乘客、司机)。
+Hudi还支持几个增量的Hive ETL管道,并且目前已集成到Uber的数据分发系统中。</p>
+
+<h3 id="emis-health">EMIS Health</h3>
+
+<p><a href="https://www.emishealth.com/">EMIS Health</a>是英国最大的初级保健IT软件提供商,其数据集包括超过5000亿的医疗保健记录。HUDI用于管理生产中的分析数据集,并使其与上游源保持同步。Presto用于查询以HUDI格式写入的数据。</p>
+
+<h3 id="yieldsio">Yields.io</h3>
+
+<p>Yields.io是第一个使用AI在企业范围内进行自动模型验证和实时监控的金融科技平台。他们的数据湖由Hudi管理,他们还积极使用Hudi为增量式、跨语言/平台机器学习构建基础架构。</p>
+
+<h3 id="yotpo">Yotpo</h3>
+
+<p>Hudi在Yotpo有不少用途。首先,在他们的<a href="https://github.com/YotpoLtd/metorikku">开源ETL框架</a>中集成了Hudi作为CDC管道的输出写入程序,即从数据库binlog生成的事件流到Kafka然后再写入S3。</p>
+
+<h2 id="演讲--报告">演讲 &amp; 报告</h2>
+
+<ol>
+  <li>
+    <p><a href="https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/56511">“Hoodie: Incremental processing on Hadoop at Uber”</a> -  By Vinoth Chandar &amp; Prasanna Rajaperumal
+Mar 2017, Strata + Hadoop World, San Jose, CA</p>
+  </li>
+  <li>
+    <p><a href="http://www.dataengconf.com/hoodie-an-open-source-incremental-processing-framework-from-uber">“Hoodie: An Open Source Incremental Processing Framework From Uber”</a> - By Vinoth Chandar.
+Apr 2017, DataEngConf, San Francisco, CA <a href="https://www.slideshare.net/vinothchandar/hoodie-dataengconf-2017">Slides</a> <a href="https://www.youtube.com/watch?v=7Wudjc-v7CA">Video</a></p>
+  </li>
+  <li>
+    <p><a href="https://spark-summit.org/2017/events/incremental-processing-on-large-analytical-datasets/">“Incremental Processing on Large Analytical Datasets”</a> - By Prasanna Rajaperumal
+June 2017, Spark Summit 2017, San Francisco, CA. <a href="https://www.slideshare.net/databricks/incremental-processing-on-large-analytical-datasets-with-prasanna-rajaperumal-and-vinoth-chandar">Slides</a> <a href="https://www.youtube.com/watch?v=3HS0lQX-cgo&amp;feature=youtu.be">Video</a></p>
+  </li>
+  <li>
+    <p><a href="https://conferences.oreilly.com/strata/strata-ny/public/schedule/detail/70937">“Hudi: Unifying storage and serving for batch and near-real-time analytics”</a> - By Nishith Agarwal &amp; Balaji Vardarajan
+September 2018, Strata Data Conference, New York, NY</p>
+  </li>
+  <li>
+    <p><a href="https://databricks.com/session/hudi-near-real-time-spark-pipelines-at-petabyte-scale">“Hudi: Large-Scale, Near Real-Time Pipelines at Uber”</a> - By Vinoth Chandar &amp; Nishith Agarwal
+October 2018, Spark+AI Summit Europe, London, UK</p>
+  </li>
+  <li>
+    <p><a href="https://www.youtube.com/watch?v=1w3IpavhSWA">“Powering Uber’s global network analytics pipelines in real-time with Apache Hudi”</a> - By Ethan Guo &amp; Nishith Agarwal, April 2019, Data Council SF19, San Francisco, CA.</p>
+  </li>
+  <li>
+    <p><a href="https://www.slideshare.net/ChesterChen/sf-big-analytics-20190612-building-highly-efficient-data-lakes-using-apache-hudi">“Building highly efficient data lakes using Apache Hudi (Incubating)”</a> - By Vinoth Chandar 
+June 2019, SF Big Analytics Meetup, San Mateo, CA</p>
+  </li>
+  <li>
+    <p><a href="https://docs.google.com/presentation/d/1FHhsvh70ZP6xXlHdVsAI0g__B_6Mpto5KQFlZ0b8-mM">“Apache Hudi (Incubating) - The Past, Present and Future Of Efficient Data Lake Architectures”</a> - By Vinoth Chandar &amp; Balaji Varadarajan
+September 2019, ApacheCon NA 19, Las Vegas, NV, USA</p>
+  </li>
+</ol>
+
+<h2 id="文章">文章</h2>
+
+<ol>
+  <li><a href="https://www.oreilly.com/ideas/ubers-case-for-incremental-processing-on-hadoop">“The Case for incremental processing on Hadoop”</a> - O’reilly Ideas article by Vinoth Chandar</li>
+  <li><a href="https://eng.uber.com/hoodie/">“Hoodie: Uber Engineering’s Incremental Processing Framework on Hadoop”</a> - Engineering Blog By Prasanna Rajaperumal</li>
+</ol>
 
       </section>
 
diff --git a/content/cn/docs/0.5.0-docs-versions.html b/content/cn/docs/0.5.1-privacy.html
similarity index 71%
copy from content/cn/docs/0.5.0-docs-versions.html
copy to content/cn/docs/0.5.1-privacy.html
index da9849f..0411a53 100644
--- a/content/cn/docs/0.5.0-docs-versions.html
+++ b/content/cn/docs/0.5.1-privacy.html
@@ -3,17 +3,17 @@
   <head>
     <meta charset="utf-8">
 
-<!-- begin _includes/seo.html --><title>文档版本 - Apache Hudi</title>
-<meta name="description" content="                              Latest            英文版            中文版                                  0.5.0            英文版            中文版                  ">
+<!-- begin _includes/seo.html --><title>Privacy Policy - Apache Hudi</title>
+<meta name="description" content="Information about your use of this website is collected using server access logs and a tracking cookie.The collected information consists of the following:">
 
 <meta property="og:type" content="article">
 <meta property="og:locale" content="en_US">
 <meta property="og:site_name" content="">
-<meta property="og:title" content="文档版本">
-<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.0-docs-versions.html">
+<meta property="og:title" content="Privacy Policy">
+<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.1-privacy.html">
 
 
-  <meta property="og:description" content="                              Latest            英文版            中文版                                  0.5.0            英文版            中文版                  ">
+  <meta property="og:description" content="Information about your use of this website is collected using server access logs and a tracking cookie.The collected information consists of the following:">
 
 
 
@@ -147,7 +147,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-quick-start-guide.html" class="">快速开始</a></li>
+              <li><a href="/cn/docs/0.5.1-quick-start-guide.html" class="">快速开始</a></li>
             
 
           
@@ -158,7 +158,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-use_cases.html" class="">使用案例</a></li>
+              <li><a href="/cn/docs/0.5.1-use_cases.html" class="">使用案例</a></li>
             
 
           
@@ -169,7 +169,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-powered_by.html" class="">演讲 & hudi 用户</a></li>
+              <li><a href="/cn/docs/0.5.1-powered_by.html" class="">演讲 & hudi 用户</a></li>
             
 
           
@@ -180,7 +180,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-comparison.html" class="">对比</a></li>
+              <li><a href="/cn/docs/0.5.1-comparison.html" class="">对比</a></li>
             
 
           
@@ -191,7 +191,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-docker_demo.html" class="">Docker 示例</a></li>
+              <li><a href="/cn/docs/0.5.1-docker_demo.html" class="">Docker 示例</a></li>
             
 
           
@@ -214,7 +214,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-concepts.html" class="">概念</a></li>
+              <li><a href="/cn/docs/0.5.1-concepts.html" class="">概念</a></li>
             
 
           
@@ -225,7 +225,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-writing_data.html" class="">写入数据</a></li>
+              <li><a href="/cn/docs/0.5.1-writing_data.html" class="">写入数据</a></li>
             
 
           
@@ -236,7 +236,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-querying_data.html" class="">查询数据</a></li>
+              <li><a href="/cn/docs/0.5.1-querying_data.html" class="">查询数据</a></li>
             
 
           
@@ -247,7 +247,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-configurations.html" class="">配置</a></li>
+              <li><a href="/cn/docs/0.5.1-configurations.html" class="">配置</a></li>
             
 
           
@@ -258,7 +258,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-performance.html" class="">性能</a></li>
+              <li><a href="/cn/docs/0.5.1-performance.html" class="">性能</a></li>
             
 
           
@@ -269,7 +269,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-admin_guide.html" class="">管理</a></li>
+              <li><a href="/cn/docs/0.5.1-deployment.html" class="">管理</a></li>
             
 
           
@@ -292,7 +292,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-docs-versions.html" class="active">文档版本</a></li>
+              <li><a href="/cn/docs/0.5.1-docs-versions.html" class="">文档版本</a></li>
             
 
           
@@ -303,7 +303,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-privacy.html" class="">版权信息</a></li>
+              <li><a href="/cn/docs/0.5.1-privacy.html" class="active">版权信息</a></li>
             
 
           
@@ -324,7 +324,7 @@
     <div class="page__inner-wrap">
       
         <header>
-          <h1 id="page-title" class="page__title" itemprop="headline">文档版本
+          <h1 id="page-title" class="page__title" itemprop="headline">Privacy Policy
 </h1>
         </header>
       
@@ -337,23 +337,24 @@
             }
           </style>
         
-        <table class="docversions">
-    <tbody>
-      
-        <tr>
-            <th>Latest</th>
-            <td><a href="/docs/quick-start-guide.html">英文版</a></td>
-            <td><a href="/cn/docs/quick-start-guide.html">中文版</a></td>
-        </tr>
-      
-        <tr>
-            <th>0.5.0</th>
-            <td><a href="/docs/0.5.0-quick-start-guide.html">英文版</a></td>
-            <td><a href="/cn/docs/0.5.0-quick-start-guide.html">中文版</a></td>
-        </tr>
-      
-    </tbody>
-</table>
+        <p>Information about your use of this website is collected using server access logs and a tracking cookie.
+The collected information consists of the following:</p>
+
+<ul>
+  <li>The IP address from which you access the website;</li>
+  <li>The type of browser and operating system you use to access our site;</li>
+  <li>The date and time you access our site;</li>
+  <li>The pages you visit;</li>
+  <li>The addresses of pages from where you followed a link to our site.</li>
+</ul>
+
+<p>Part of this information is gathered using a tracking cookie set by the <a href="http://www.google.com/analytics">Google Analytics</a> service and handled by Google as described in their <a href="http://www.google.com/privacy.html">privacy policy</a>. See your browser documentation for instructions on how to disable the cookie if you prefer not to share this data with Google.</p>
+
+<p>We use the gathered information to help us make our site more useful to visitors and to better understand how and when our site is used. We do not track or collect personally identifiable information or associate gathered data with any personally identifying information from other sources.</p>
+
+<p>By using this website, you consent to the collection of this data in the manner and for the purpose described above.</p>
+
+<p>The Hudi development community welcomes your questions or comments regarding this Privacy Policy. Send them to dev@hudi.apache.org.</p>
 
       </section>
 
diff --git a/content/cn/docs/querying_data.html b/content/cn/docs/0.5.1-querying_data.html
similarity index 87%
copy from content/cn/docs/querying_data.html
copy to content/cn/docs/0.5.1-querying_data.html
index b9b9147..e74523f 100644
--- a/content/cn/docs/querying_data.html
+++ b/content/cn/docs/0.5.1-querying_data.html
@@ -10,7 +10,7 @@
 <meta property="og:locale" content="en_US">
 <meta property="og:site_name" content="">
 <meta property="og:title" content="查询 Hudi 数据集">
-<meta property="og:url" content="https://hudi.apache.org/cn/docs/querying_data.html">
+<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.1-querying_data.html">
 
 
   <meta property="og:description" content="从概念上讲,Hudi物理存储一次数据到DFS上,同时在其上提供三个逻辑视图,如之前所述。数据集同步到Hive Metastore后,它将提供由Hudi的自定义输入格式支持的Hive外部表。一旦提供了适当的Hudi捆绑包,就可以通过Hive、Spark和Presto之类的常用查询引擎来查询数据集。">
@@ -147,7 +147,7 @@
             
 
             
-              <li><a href="/cn/docs/quick-start-guide.html" class="">快速开始</a></li>
+              <li><a href="/cn/docs/0.5.1-quick-start-guide.html" class="">快速开始</a></li>
             
 
           
@@ -158,7 +158,7 @@
             
 
             
-              <li><a href="/cn/docs/use_cases.html" class="">使用案例</a></li>
+              <li><a href="/cn/docs/0.5.1-use_cases.html" class="">使用案例</a></li>
             
 
           
@@ -169,7 +169,7 @@
             
 
             
-              <li><a href="/cn/docs/powered_by.html" class="">演讲 & hudi 用户</a></li>
+              <li><a href="/cn/docs/0.5.1-powered_by.html" class="">演讲 & hudi 用户</a></li>
             
 
           
@@ -180,7 +180,7 @@
             
 
             
-              <li><a href="/cn/docs/comparison.html" class="">对比</a></li>
+              <li><a href="/cn/docs/0.5.1-comparison.html" class="">对比</a></li>
             
 
           
@@ -191,7 +191,7 @@
             
 
             
-              <li><a href="/cn/docs/docker_demo.html" class="">Docker 示例</a></li>
+              <li><a href="/cn/docs/0.5.1-docker_demo.html" class="">Docker 示例</a></li>
             
 
           
@@ -214,7 +214,7 @@
             
 
             
-              <li><a href="/cn/docs/concepts.html" class="">概念</a></li>
+              <li><a href="/cn/docs/0.5.1-concepts.html" class="">概念</a></li>
             
 
           
@@ -225,7 +225,7 @@
             
 
             
-              <li><a href="/cn/docs/writing_data.html" class="">写入数据</a></li>
+              <li><a href="/cn/docs/0.5.1-writing_data.html" class="">写入数据</a></li>
             
 
           
@@ -236,7 +236,7 @@
             
 
             
-              <li><a href="/cn/docs/querying_data.html" class="active">查询数据</a></li>
+              <li><a href="/cn/docs/0.5.1-querying_data.html" class="active">查询数据</a></li>
             
 
           
@@ -247,7 +247,7 @@
             
 
             
-              <li><a href="/cn/docs/configurations.html" class="">配置</a></li>
+              <li><a href="/cn/docs/0.5.1-configurations.html" class="">配置</a></li>
             
 
           
@@ -258,7 +258,7 @@
             
 
             
-              <li><a href="/cn/docs/performance.html" class="">性能</a></li>
+              <li><a href="/cn/docs/0.5.1-performance.html" class="">性能</a></li>
             
 
           
@@ -269,7 +269,7 @@
             
 
             
-              <li><a href="/cn/docs/deployment.html" class="">管理</a></li>
+              <li><a href="/cn/docs/0.5.1-deployment.html" class="">管理</a></li>
             
 
           
@@ -292,7 +292,7 @@
             
 
             
-              <li><a href="/cn/docs/docs-versions.html" class="">文档版本</a></li>
+              <li><a href="/cn/docs/0.5.1-docs-versions.html" class="">文档版本</a></li>
             
 
           
@@ -303,7 +303,7 @@
             
 
             
-              <li><a href="/cn/docs/privacy.html" class="">版权信息</a></li>
+              <li><a href="/cn/docs/0.5.1-privacy.html" class="">版权信息</a></li>
             
 
           
@@ -350,15 +350,20 @@
     </ul>
   </li>
   <li><a href="#presto">Presto</a></li>
+  <li><a href="#impala此功能还未正式发布">Impala(此功能还未正式发布)</a>
+    <ul>
+      <li><a href="#读优化表">读优化表</a></li>
+    </ul>
+  </li>
 </ul>
           </nav>
         </aside>
         
-        <p>从概念上讲,Hudi物理存储一次数据到DFS上,同时在其上提供三个逻辑视图,如<a href="/cn/docs/concepts.html#views">之前</a>所述。
+        <p>从概念上讲,Hudi物理存储一次数据到DFS上,同时在其上提供三个逻辑视图,如<a href="/cn/docs/0.5.1-concepts.html#views">之前</a>所述。
 数据集同步到Hive Metastore后,它将提供由Hudi的自定义输入格式支持的Hive外部表。一旦提供了适当的Hudi捆绑包,
 就可以通过Hive、Spark和Presto之类的常用查询引擎来查询数据集。</p>
 
-<p>具体来说,在写入过程中传递了两个由<a href="/cn/docs/configurations.html#TABLE_NAME_OPT_KEY">table name</a>命名的Hive表。
+<p>具体来说,在写入过程中传递了两个由<a href="/cn/docs/0.5.1-configurations.html#TABLE_NAME_OPT_KEY">table name</a>命名的Hive表。
 例如,如果<code class="highlighter-rouge">table name = hudi_tbl</code>,我们得到</p>
 
 <ul>
@@ -369,7 +374,7 @@
 <p>如概念部分所述,<a href="https://www.oreilly.com/ideas/ubers-case-for-incremental-processing-on-hadoop">增量处理</a>所需要的
 一个关键原语是<code class="highlighter-rouge">增量拉取</code>(以从数据集中获取更改流/日志)。您可以增量提取Hudi数据集,这意味着自指定的即时时间起,
 您可以只获得全部更新和新行。 这与插入更新一起使用,对于构建某些数据管道尤其有用,包括将1个或多个源Hudi表(数据流/事实)以增量方式拉出(流/事实)
-并与其他表(数据集/维度)结合以<a href="/cn/docs/writing_data.html">写出增量</a>到目标Hudi数据集。增量视图是通过查询上表之一实现的,并具有特殊配置,
+并与其他表(数据集/维度)结合以<a href="/cn/docs/0.5.1-writing_data.html">写出增量</a>到目标Hudi数据集。增量视图是通过查询上表之一实现的,并具有特殊配置,
 该特殊配置指示查询计划仅需要从数据集中获取增量数据。</p>
 
 <p>接下来,我们将详细讨论在每个查询引擎上如何访问所有三个视图。</p>
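The incremental-pull semantics described above can be sketched in a few lines. This is a toy illustration of the idea only, under the assumption that each row carries its commit instant in `_hoodie_commit_time`; the real filtering happens inside Hudi's query planner, not in user code.

```python
# Toy illustration of incremental pull: an incremental view returns only rows
# whose commit instant is strictly greater than the requested begin instant.
def incremental_pull(records, begin_instant):
    """Return the change stream: rows committed after begin_instant."""
    return [r for r in records if r["_hoodie_commit_time"] > begin_instant]

table = [
    {"_hoodie_commit_time": "20200301", "uuid": "a", "fare": 10.0},
    {"_hoodie_commit_time": "20200305", "uuid": "b", "fare": 35.0},
    {"_hoodie_commit_time": "20200306", "uuid": "a", "fare": 12.5},  # update of "a"
]

changes = incremental_pull(table, "20200301")
print([r["uuid"] for r in changes])  # rows from commits after 20200301
```

Note how the update of record "a" reappears in the change stream: incremental pull surfaces both new rows and updated versions of existing rows.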
@@ -540,7 +545,7 @@ Upsert实用程序(<code class="highlighter-rouge">HoodieDeltaStreamer</code>
      <span class="o">.</span><span class="na">load</span><span class="o">(</span><span class="n">tablePath</span><span class="o">);</span> <span class="c1">// For incremental view, pass in the root/base path of dataset</span>
 </code></pre></div></div>
 
-<p>请参阅<a href="/cn/docs/configurations.html#spark-datasource">设置</a>部分,以查看所有数据源选项。</p>
+<p>请参阅<a href="/cn/docs/0.5.1-configurations.html#spark-datasource">设置</a>部分,以查看所有数据源选项。</p>
 
 <p>另外,<code class="highlighter-rouge">HoodieReadClient</code>通过Hudi的隐式索引提供了以下功能。</p>
 
@@ -572,6 +577,33 @@ Upsert实用程序(<code class="highlighter-rouge">HoodieDeltaStreamer</code>
 <p>Presto是一种常用的查询引擎,可提供交互式查询性能。 Hudi RO表可以在Presto中无缝查询。
 这需要在整个安装过程中将<code class="highlighter-rouge">hudi-presto-bundle</code> jar放入<code class="highlighter-rouge">&lt;presto_install&gt;/plugin/hive-hadoop2/</code>中。</p>
 
+<h2 id="impala此功能还未正式发布">Impala(此功能还未正式发布)</h2>
+
+<h3 id="读优化表">读优化表</h3>
+
+<p>Impala可以以<a href="https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/impala_tables.html#external_tables">EXTERNAL TABLE</a>的形式查询HDFS上的Hudi读优化表。<br />
+可以通过以下方式在Impala上建立Hudi读优化表:</p>
+<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE EXTERNAL TABLE database.table_name
+LIKE PARQUET '/path/to/load/xxx.parquet'
+STORED AS HUDIPARQUET
+LOCATION '/path/to/load';
+</code></pre></div></div>
+<p>Impala可以利用合理的文件分区来提高查询效率。
+如果想要建立分区表,文件夹需要按照<code class="highlighter-rouge">year=2020/month=1</code>这种方式命名,
+Impala使用<code class="highlighter-rouge">=</code>来区分分区名和分区值。<br />
+可以通过以下方式在Impala上建立分区的Hudi读优化表:</p>
+<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE EXTERNAL TABLE database.table_name
+LIKE PARQUET '/path/to/load/xxx.parquet'
+PARTITION BY (year int, month int, day int)
+STORED AS HUDIPARQUET
+LOCATION '/path/to/load';
+ALTER TABLE database.table_name RECOVER PARTITIONS;
+</code></pre></div></div>
+<p>在Hudi成功写入一个新的提交后,刷新Impala表以得到最新的结果。</p>
+<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>REFRESH database.table_name
+</code></pre></div></div>
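To keep Impala in sync, the REFRESH step above has to run after every successful Hudi commit. A minimal wrapper might look like the sketch below; the `impala-shell -q` invocation is an assumption about your deployment, and by default the function only prints the statement (dry run) rather than executing it.

```shell
# Hypothetical helper: emit (or execute) the REFRESH statement for a Hudi
# read-optimized table after a new commit lands. Pass "run" as the second
# argument to actually invoke impala-shell; the default is a dry run.
refresh_hudi_table() {
  local table="$1" mode="${2:-print}"
  local stmt="REFRESH ${table};"
  if [ "$mode" = "run" ]; then
    impala-shell -q "$stmt"      # requires a reachable impalad
  else
    printf '%s\n' "$stmt"        # dry run: just show the statement
  fi
}

refresh_hudi_table database.table_name
```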
+
+
       </section>
 
       <a href="#masthead__inner-wrap" class="back-to-top">Back to top &uarr;</a>
diff --git a/content/cn/docs/0.5.1-quick-start-guide.html b/content/cn/docs/0.5.1-quick-start-guide.html
new file mode 100644
index 0000000..feed431
--- /dev/null
+++ b/content/cn/docs/0.5.1-quick-start-guide.html
@@ -0,0 +1,539 @@
+<!doctype html>
+<html lang="en" class="no-js">
+  <head>
+    <meta charset="utf-8">
+
+<!-- begin _includes/seo.html --><title>Quick-Start Guide - Apache Hudi</title>
+<meta name="description" content="本指南通过使用spark-shell简要介绍了Hudi功能。使用Spark数据源,我们将通过代码段展示如何插入和更新Hudi默认存储类型的数据集:写时复制。每次写操作之后,我们还将展示如何读取快照和增量读取数据。">
+
+<meta property="og:type" content="article">
+<meta property="og:locale" content="en_US">
+<meta property="og:site_name" content="">
+<meta property="og:title" content="Quick-Start Guide">
+<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.1-quick-start-guide.html">
+
+
+  <meta property="og:description" content="本指南通过使用spark-shell简要介绍了Hudi功能。使用Spark数据源,我们将通过代码段展示如何插入和更新Hudi默认存储类型的数据集:写时复制。每次写操作之后,我们还将展示如何读取快照和增量读取数据。">
+
+
+
+
+
+  <meta property="article:modified_time" content="2019-12-30T14:59:57-05:00">
+
+
+
+
+
+
+
+<!-- end _includes/seo.html -->
+
+
+<!--<link href="/feed.xml" type="application/atom+xml" rel="alternate" title=" Feed">-->
+
+<!-- https://t.co/dKP3o1e -->
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+<script>
+  document.documentElement.className = document.documentElement.className.replace(/\bno-js\b/g, '') + ' js ';
+</script>
+
+<!-- For all browsers -->
+<link rel="stylesheet" href="/assets/css/main.css">
+
+<!--[if IE]>
+  <style>
+    /* old IE unsupported flexbox fixes */
+    .greedy-nav .site-title {
+      padding-right: 3em;
+    }
+    .greedy-nav button {
+      position: absolute;
+      top: 0;
+      right: 0;
+      height: 100%;
+    }
+  </style>
+<![endif]-->
+
+
+
+<link rel="icon" type="image/x-icon" href="/assets/images/favicon.ico">
+<link rel="stylesheet" href="/assets/css/font-awesome.min.css">
+
+  </head>
+
+  <body class="layout--single">
+    <!--[if lt IE 9]>
+<div class="notice--danger align-center" style="margin: 0;">You are using an <strong>outdated</strong> browser. Please <a href="https://browsehappy.com/">upgrade your browser</a> to improve your experience.</div>
+<![endif]-->
+
+    <div class="masthead">
+  <div class="masthead__inner-wrap" id="masthead__inner-wrap">
+    <div class="masthead__menu">
+      <nav id="site-nav" class="greedy-nav">
+        
+          <a class="site-logo" href="/">
+              <div style="width: 150px; height: 40px">
+              </div>
+          </a>
+        
+        <a class="site-title" href="/">
+          
+        </a>
+        <ul class="visible-links"><li class="masthead__menu-item">
+              <a href="/cn/docs/quick-start-guide.html" target="_self" >文档</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/community.html" target="_self" >社区</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/activity.html" target="_self" >动态</a>
+            </li><li class="masthead__menu-item">
+              <a href="https://cwiki.apache.org/confluence/display/HUDI/FAQ" target="_blank" >FAQ</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/releases.html" target="_self" >发布</a>
+            </li></ul>
+        <button class="greedy-nav__toggle hidden" type="button">
+          <span class="visually-hidden">Toggle menu</span>
+          <div class="navicon"></div>
+        </button>
+        <ul class="hidden-links hidden"></ul>
+      </nav>
+    </div>
+  </div>
+</div>
+<!--
+<p class="notice--warning" style="margin: 0 !important; text-align: center !important;"><strong>Note:</strong> This site is work in progress, if you notice any issues, please <a target="_blank" href="https://github.com/apache/incubator-hudi/issues">Report on Issue</a>.
+  Click <a href="/"> here</a> back to old site.</p>
+-->
+
+    <div class="initial-content">
+      <div id="main" role="main">
+  
+
+  <div class="sidebar sticky">
+
+  
+
+  
+
+    
+      
+
+
+
+
+
+
+
+<nav class="nav__list">
+  
+  <input id="ac-toc" name="accordion-toc" type="checkbox" />
+  <label for="ac-toc">文档菜单</label>
+  <ul class="nav__items">
+    
+      <li>
+        
+          <span class="nav__sub-title">入门指南</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-quick-start-guide.html" class="active">快速开始</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-use_cases.html" class="">使用案例</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-powered_by.html" class="">演讲 & hudi 用户</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-comparison.html" class="">对比</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-docker_demo.html" class="">Docker 示例</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">帮助文档</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-concepts.html" class="">概念</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-writing_data.html" class="">写入数据</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-querying_data.html" class="">查询数据</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-configurations.html" class="">配置</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-performance.html" class="">性能</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-deployment.html" class="">管理</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">其他信息</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-docs-versions.html" class="">文档版本</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-privacy.html" class="">版权信息</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+  </ul>
+</nav>
+    
+
+  
+  </div>
+
+
+  <article class="page" itemscope itemtype="https://schema.org/CreativeWork">
+
+    <div class="page__inner-wrap">
+      
+        <header>
+          <h1 id="page-title" class="page__title" itemprop="headline">Quick-Start Guide
+</h1>
+        </header>
+      
+
+      <section class="page__content" itemprop="text">
+        
+        <aside class="sidebar__right sticky">
+          <nav class="toc">
+            <header><h4 class="nav__title"><i class="fas fa-file-alt"></i> IN THIS PAGE</h4></header>
+            <ul class="toc__menu">
+  <li><a href="#设置spark-shell">设置spark-shell</a></li>
+  <li><a href="#inserts">插入数据</a></li>
+  <li><a href="#query">查询数据</a></li>
+  <li><a href="#updates">更新数据</a></li>
+  <li><a href="#增量查询">增量查询</a></li>
+  <li><a href="#特定时间点查询">特定时间点查询</a></li>
+  <li><a href="#从这开始下一步">从这开始下一步?</a></li>
+</ul>
+          </nav>
+        </aside>
+        
+        <p>本指南通过使用spark-shell简要介绍了Hudi功能。使用Spark数据源,我们将通过代码段展示如何插入和更新Hudi默认存储类型的数据集:
+<a href="/cn/docs/0.5.1-concepts.html#copy-on-write-storage">写时复制</a>。每次写操作之后,我们还将展示如何读取快照和增量读取数据。</p>
+
+<h2 id="设置spark-shell">设置spark-shell</h2>
+<p>Hudi适用于Spark-2.x版本。您可以按照<a href="https://spark.apache.org/downloads.html">此处</a>的说明设置spark。
+在提取的目录中,使用spark-shell运行Hudi:</p>
+
+<div class="language-scala highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">bin</span><span class="o">/</span><span class="n">spark</span><span class="o">-</span><span class="n">shell</span> <span class="o">--</span><span class="n">packages</span> <span class="nv">org</span><span class="o">.</span><span class="py">apache</span><span class="o">.</span><span class="py">hudi</span><span class="k">:</span><span class="kt">hudi-spark-bundle:</span><span c [...]
+</code></pre></div></div>
+
+<p>设置表名、基本路径和数据生成器来为本指南生成记录。</p>
+
+<div class="language-scala highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="nn">org.apache.hudi.QuickstartUtils._</span>
+<span class="k">import</span> <span class="nn">scala.collection.JavaConversions._</span>
+<span class="k">import</span> <span class="nn">org.apache.spark.sql.SaveMode._</span>
+<span class="k">import</span> <span class="nn">org.apache.hudi.DataSourceReadOptions._</span>
+<span class="k">import</span> <span class="nn">org.apache.hudi.DataSourceWriteOptions._</span>
+<span class="k">import</span> <span class="nn">org.apache.hudi.config.HoodieWriteConfig._</span>
+
+<span class="k">val</span> <span class="nv">tableName</span> <span class="k">=</span> <span class="s">"hudi_cow_table"</span>
+<span class="k">val</span> <span class="nv">basePath</span> <span class="k">=</span> <span class="s">"file:///tmp/hudi_cow_table"</span>
+<span class="k">val</span> <span class="nv">dataGen</span> <span class="k">=</span> <span class="k">new</span> <span class="nc">DataGenerator</span>
+</code></pre></div></div>
+
+<p><a href="https://github.com/apache/incubator-hudi/blob/master/hudi-spark/src/main/java/org/apache/hudi/QuickstartUtils.java#L50">数据生成器</a>
+可以基于<a href="https://github.com/apache/incubator-hudi/blob/master/hudi-spark/src/main/java/org/apache/hudi/QuickstartUtils.java#L57">行程样本模式</a>
+生成插入和更新的样本。</p>
+
+<h2 id="inserts">插入数据</h2>
+<p>生成一些新的行程样本,将其加载到DataFrame中,然后将DataFrame写入Hudi数据集中,如下所示。</p>
+
+<div class="language-scala highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">val</span> <span class="nv">inserts</span> <span class="k">=</span> <span class="nf">convertToStringList</span><span class="o">(</span><span class="nv">dataGen</span><span class="o">.</span><span class="py">generateInserts</span><span class="o">(</span><span class="mi">10</span><span class="o">))</span>
+<span class="k">val</span> <span class="nv">df</span> <span class="k">=</span> <span class="nv">spark</span><span class="o">.</span><span class="py">read</span><span class="o">.</span><span class="py">json</span><span class="o">(</span><span class="nv">spark</span><span class="o">.</span><span class="py">sparkContext</span><span class="o">.</span><span class="py">parallelize</span><span class="o">(</span><span class="n">inserts</span><span class="o">,</span> <span class="mi">2</span><spa [...]
+<span class="nv">df</span><span class="o">.</span><span class="py">write</span><span class="o">.</span><span class="py">format</span><span class="o">(</span><span class="s">"org.apache.hudi"</span><span class="o">).</span>
+    <span class="nf">options</span><span class="o">(</span><span class="n">getQuickstartWriteConfigs</span><span class="o">).</span>
+    <span class="nf">option</span><span class="o">(</span><span class="nc">PRECOMBINE_FIELD_OPT_KEY</span><span class="o">,</span> <span class="s">"ts"</span><span class="o">).</span>
+    <span class="nf">option</span><span class="o">(</span><span class="nc">RECORDKEY_FIELD_OPT_KEY</span><span class="o">,</span> <span class="s">"uuid"</span><span class="o">).</span>
+    <span class="nf">option</span><span class="o">(</span><span class="nc">PARTITIONPATH_FIELD_OPT_KEY</span><span class="o">,</span> <span class="s">"partitionpath"</span><span class="o">).</span>
+    <span class="nf">option</span><span class="o">(</span><span class="nc">TABLE_NAME</span><span class="o">,</span> <span class="n">tableName</span><span class="o">).</span>
+    <span class="nf">mode</span><span class="o">(</span><span class="nc">Overwrite</span><span class="o">).</span>
+    <span class="nf">save</span><span class="o">(</span><span class="n">basePath</span><span class="o">);</span>
+</code></pre></div></div>
+
+<p><code class="highlighter-rouge">mode(Overwrite)</code>覆盖并重新创建数据集(如果已经存在)。
+您可以检查在<code class="highlighter-rouge">/tmp/hudi_cow_table/&lt;region&gt;/&lt;country&gt;/&lt;city&gt;/</code>下生成的数据。我们提供了一个记录键
+(<a href="#sample-schema">schema</a>中的<code class="highlighter-rouge">uuid</code>),分区字段(<code class="highlighter-rouge">region/country/city</code>)和组合逻辑(<a href="#sample-schema">schema</a>中的<code class="highlighter-rouge">ts</code>)
+以确保行程记录在每个分区中都是唯一的。更多信息请参阅
+<a href="https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=113709185#FAQ-HowdoImodelthedatastoredinHudi">对Hudi中的数据进行建模</a>,
+有关将数据提取到Hudi中的方法的信息,请参阅<a href="/cn/docs/0.5.1-writing_data.html">写入Hudi数据集</a>。
+这里我们使用默认的写操作:<code class="highlighter-rouge">插入更新</code>。 如果您的工作负载没有<code class="highlighter-rouge">更新</code>,也可以使用更快的<code class="highlighter-rouge">插入</code>或<code class="highlighter-rouge">批量插入</code>操作。
+想了解更多信息,请参阅<a href="/cn/docs/0.5.1-writing_data.html#write-operations">写操作</a></p>
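The record-key / precombine behavior described above (each `uuid` unique per partition, with `ts` deciding which version wins) can be made concrete with a small sketch. This is an illustration of the semantics only, not Hudi's actual implementation.

```python
# Toy sketch of upsert-with-precombine: incoming records sharing a record key
# ("uuid") replace the stored record only if their precombine field ("ts")
# is at least as new.
def upsert(existing, incoming, key="uuid", precombine="ts"):
    merged = {r[key]: r for r in existing}
    for r in incoming:
        old = merged.get(r[key])
        # newer ts wins; otherwise the stored record is kept
        if old is None or r[precombine] >= old[precombine]:
            merged[r[key]] = r
    return sorted(merged.values(), key=lambda r: r[key])

existing = [{"uuid": "a", "ts": 1, "fare": 10.0}]
incoming = [{"uuid": "a", "ts": 2, "fare": 12.5},   # update wins (newer ts)
            {"uuid": "b", "ts": 1, "fare": 35.0}]   # plain insert
print(upsert(existing, incoming))
```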
+
+<h2 id="query">查询数据</h2>
+
+<p>将数据文件加载到DataFrame中。</p>
+
+<div class="language-scala highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">val</span> <span class="nv">roViewDF</span> <span class="k">=</span> <span class="n">spark</span><span class="o">.</span>
+    <span class="n">read</span><span class="o">.</span>
+    <span class="nf">format</span><span class="o">(</span><span class="s">"org.apache.hudi"</span><span class="o">).</span>
+    <span class="nf">load</span><span class="o">(</span><span class="n">basePath</span> <span class="o">+</span> <span class="s">"/*/*/*/*"</span><span class="o">)</span>
+<span class="nv">roViewDF</span><span class="o">.</span><span class="py">registerTempTable</span><span class="o">(</span><span class="s">"hudi_ro_table"</span><span class="o">)</span>
+<span class="nv">spark</span><span class="o">.</span><span class="py">sql</span><span class="o">(</span><span class="s">"select fare, begin_lon, begin_lat, ts from  hudi_ro_table where fare &gt; 20.0"</span><span class="o">).</span><span class="py">show</span><span class="o">()</span>
+<span class="nv">spark</span><span class="o">.</span><span class="py">sql</span><span class="o">(</span><span class="s">"select _hoodie_commit_time, _hoodie_record_key, _hoodie_partition_path, rider, driver, fare from  hudi_ro_table"</span><span class="o">).</span><span class="py">show</span><span class="o">()</span>
+</code></pre></div></div>
+
+<p>该查询提供已提取数据的读取优化视图。由于我们的分区路径(<code class="highlighter-rouge">region/country/city</code>)从基本路径开始有3层嵌套,
+因此我们使用了<code class="highlighter-rouge">load(basePath + "/*/*/*/*")</code>。
+有关支持的所有存储类型和视图的更多信息,请参考<a href="/cn/docs/0.5.1-concepts.html#storage-types--views">存储类型和视图</a>。</p>
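The shape of that glob follows directly from the partition depth: one `*` per partition level plus one more for the data files themselves. A tiny helper makes the rule explicit (illustration only; `hudi_load_glob` is not part of the Hudi API).

```python
# Build the load() glob for a Hudi dataset partitioned N levels deep:
# N directory wildcards plus one extra level for the parquet files.
def hudi_load_glob(base_path, partition_depth):
    return base_path + "/*" * (partition_depth + 1)

# region/country/city = 3 partition levels -> four "*" segments
print(hudi_load_glob("file:///tmp/hudi_cow_table", 3))
```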
+
+<h2 id="updates">更新数据</h2>
+
+<p>这类似于插入新数据。使用数据生成器生成对现有行程的更新,加载到DataFrame中并将DataFrame写入hudi数据集。</p>
+
+<div class="language-scala highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">val</span> <span class="nv">updates</span> <span class="k">=</span> <span class="nf">convertToStringList</span><span class="o">(</span><span class="nv">dataGen</span><span class="o">.</span><span class="py">generateUpdates</span><span class="o">(</span><span class="mi">10</span><span class="o">))</span>
+<span class="k">val</span> <span class="nv">df</span> <span class="k">=</span> <span class="nv">spark</span><span class="o">.</span><span class="py">read</span><span class="o">.</span><span class="py">json</span><span class="o">(</span><span class="nv">spark</span><span class="o">.</span><span class="py">sparkContext</span><span class="o">.</span><span class="py">parallelize</span><span class="o">(</span><span class="n">updates</span><span class="o">,</span> <span class="mi">2</span><spa [...]
+<span class="nv">df</span><span class="o">.</span><span class="py">write</span><span class="o">.</span><span class="py">format</span><span class="o">(</span><span class="s">"org.apache.hudi"</span><span class="o">).</span>
+    <span class="nf">options</span><span class="o">(</span><span class="n">getQuickstartWriteConfigs</span><span class="o">).</span>
+    <span class="nf">option</span><span class="o">(</span><span class="nc">PRECOMBINE_FIELD_OPT_KEY</span><span class="o">,</span> <span class="s">"ts"</span><span class="o">).</span>
+    <span class="nf">option</span><span class="o">(</span><span class="nc">RECORDKEY_FIELD_OPT_KEY</span><span class="o">,</span> <span class="s">"uuid"</span><span class="o">).</span>
+    <span class="nf">option</span><span class="o">(</span><span class="nc">PARTITIONPATH_FIELD_OPT_KEY</span><span class="o">,</span> <span class="s">"partitionpath"</span><span class="o">).</span>
+    <span class="nf">option</span><span class="o">(</span><span class="nc">TABLE_NAME</span><span class="o">,</span> <span class="n">tableName</span><span class="o">).</span>
+    <span class="nf">mode</span><span class="o">(</span><span class="nc">Append</span><span class="o">).</span>
+    <span class="nf">save</span><span class="o">(</span><span class="n">basePath</span><span class="o">);</span>
+</code></pre></div></div>
+
+<p>注意,保存模式现在为<code class="highlighter-rouge">追加</code>。通常,除非您是第一次尝试创建数据集,否则请始终使用追加模式。
+现在再次<a href="#query">查询</a>数据将显示更新后的行程。每个写操作都会生成一个新的由时间戳表示的<a href="/cn/docs/0.5.1-concepts.html">commit</a>。
+对于相同的<code class="highlighter-rouge">_hoodie_record_key</code>,可以对比其在之前提交中的<code class="highlighter-rouge">_hoodie_commit_time</code>、<code class="highlighter-rouge">rider</code>、<code class="highlighter-rouge">driver</code>字段的变更。</p>
+
+<h2 id="增量查询">增量查询</h2>
+
+<p>Hudi还提供了获取给定提交时间戳以来已更改的记录流的功能。
+这可以通过使用Hudi的增量视图并提供所需更改的开始时间来实现。
+如果我们需要给定提交之后的所有更改(这是常见的情况),则无需指定结束时间。</p>
+
+<div class="language-scala highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// reload data
+</span><span class="n">spark</span><span class="o">.</span>
+    <span class="n">read</span><span class="o">.</span>
+    <span class="nf">format</span><span class="o">(</span><span class="s">"org.apache.hudi"</span><span class="o">).</span>
+    <span class="nf">load</span><span class="o">(</span><span class="n">basePath</span> <span class="o">+</span> <span class="s">"/*/*/*/*"</span><span class="o">).</span>
+    <span class="nf">createOrReplaceTempView</span><span class="o">(</span><span class="s">"hudi_ro_table"</span><span class="o">)</span>
+
+<span class="k">val</span> <span class="nv">commits</span> <span class="k">=</span> <span class="nv">spark</span><span class="o">.</span><span class="py">sql</span><span class="o">(</span><span class="s">"select distinct(_hoodie_commit_time) as commitTime from  hudi_ro_table order by commitTime"</span><span class="o">).</span><span class="py">map</span><span class="o">(</span><span class="n">k</span> <span class="k">=&gt;</span> <span class="nv">k</span><span class="o">.</span><span clas [...]
+<span class="k">val</span> <span class="nv">beginTime</span> <span class="k">=</span> <span class="nf">commits</span><span class="o">(</span><span class="nv">commits</span><span class="o">.</span><span class="py">length</span> <span class="o">-</span> <span class="mi">2</span><span class="o">)</span> <span class="c1">// commit time we are interested in
+</span>
+<span class="c1">// 增量查询数据
+</span><span class="k">val</span> <span class="nv">incViewDF</span> <span class="k">=</span> <span class="n">spark</span><span class="o">.</span>
+    <span class="n">read</span><span class="o">.</span>
+    <span class="nf">format</span><span class="o">(</span><span class="s">"org.apache.hudi"</span><span class="o">).</span>
+    <span class="nf">option</span><span class="o">(</span><span class="nc">VIEW_TYPE_OPT_KEY</span><span class="o">,</span> <span class="nc">VIEW_TYPE_INCREMENTAL_OPT_VAL</span><span class="o">).</span>
+    <span class="nf">option</span><span class="o">(</span><span class="nc">BEGIN_INSTANTTIME_OPT_KEY</span><span class="o">,</span> <span class="n">beginTime</span><span class="o">).</span>
+    <span class="nf">load</span><span class="o">(</span><span class="n">basePath</span><span class="o">);</span>
+<span class="nv">incViewDF</span><span class="o">.</span><span class="py">registerTempTable</span><span class="o">(</span><span class="s">"hudi_incr_table"</span><span class="o">)</span>
+<span class="nv">spark</span><span class="o">.</span><span class="py">sql</span><span class="o">(</span><span class="s">"select `_hoodie_commit_time`, fare, begin_lon, begin_lat, ts from  hudi_incr_table where fare &gt; 20.0"</span><span class="o">).</span><span class="py">show</span><span class="o">()</span>
+</code></pre></div></div>
+
+<p>这将返回开始时间之后提交的所有更改,并过滤出票价大于20.0的记录。此功能的独特之处在于,它使您可以在批量数据上构建流式管道。</p>
+
+<h2 id="特定时间点查询">特定时间点查询</h2>
+
+<p>让我们看一下如何查询特定时间的数据。可以通过将结束时间指向特定的提交时间,将开始时间指向"000"(表示最早的提交时间)来表示特定时间。</p>
+
+<div class="language-scala highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">val</span> <span class="nv">beginTime</span> <span class="k">=</span> <span class="s">"000"</span> <span class="c1">// Represents all commits &gt; this time.
+</span><span class="k">val</span> <span class="nv">endTime</span> <span class="k">=</span> <span class="nf">commits</span><span class="o">(</span><span class="nv">commits</span><span class="o">.</span><span class="py">length</span> <span class="o">-</span> <span class="mi">2</span><span class="o">)</span> <span class="c1">// commit time we are interested in
+</span>
+<span class="c1">// 增量查询数据
+</span><span class="k">val</span> <span class="nv">incViewDF</span> <span class="k">=</span> <span class="nv">spark</span><span class="o">.</span><span class="py">read</span><span class="o">.</span><span class="py">format</span><span class="o">(</span><span class="s">"org.apache.hudi"</span><span class="o">).</span>
+    <span class="nf">option</span><span class="o">(</span><span class="nc">VIEW_TYPE_OPT_KEY</span><span class="o">,</span> <span class="nc">VIEW_TYPE_INCREMENTAL_OPT_VAL</span><span class="o">).</span>
+    <span class="nf">option</span><span class="o">(</span><span class="nc">BEGIN_INSTANTTIME_OPT_KEY</span><span class="o">,</span> <span class="n">beginTime</span><span class="o">).</span>
+    <span class="nf">option</span><span class="o">(</span><span class="nc">END_INSTANTTIME_OPT_KEY</span><span class="o">,</span> <span class="n">endTime</span><span class="o">).</span>
+    <span class="nf">load</span><span class="o">(</span><span class="n">basePath</span><span class="o">);</span>
+<span class="nv">incViewDF</span><span class="o">.</span><span class="py">registerTempTable</span><span class="o">(</span><span class="s">"hudi_incr_table"</span><span class="o">)</span>
+<span class="nv">spark</span><span class="o">.</span><span class="py">sql</span><span class="o">(</span><span class="s">"select `_hoodie_commit_time`, fare, begin_lon, begin_lat, ts from  hudi_incr_table where fare &gt; 20.0"</span><span class="o">).</span><span class="py">show</span><span class="o">()</span>
+</code></pre></div></div>
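The window semantics implied above can be sketched as a filter: with beginTime = "000" (earliest possible instant) and endTime set to a specific commit, the incremental view yields rows from commits in the half-open window (begin, end]. Again this is a toy model of the semantics, not Hudi's actual filtering code.

```python
# Toy point-in-time query: keep rows whose commit instant falls in (begin, end].
def point_in_time(records, begin, end):
    return [r for r in records
            if begin < r["_hoodie_commit_time"] <= end]

table = [
    {"_hoodie_commit_time": "20200301", "uuid": "a"},
    {"_hoodie_commit_time": "20200305", "uuid": "b"},
    {"_hoodie_commit_time": "20200306", "uuid": "c"},
]

# "000" sorts before any real instant, so only endTime bounds the result.
print([r["uuid"] for r in point_in_time(table, "000", "20200305")])
```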
+
+<h2 id="从这开始下一步">从这开始下一步?</h2>
+
+<p>您也可以通过<a href="https://github.com/apache/incubator-hudi#building-apache-hudi-from-source">自己构建hudi</a>来快速开始,
+并在spark-shell命令中使用<code class="highlighter-rouge">--jars &lt;path to hudi_code&gt;/packaging/hudi-spark-bundle/target/hudi-spark-bundle-*.*.*-SNAPSHOT.jar</code>,
+而不是<code class="highlighter-rouge">--packages org.apache.hudi:hudi-spark-bundle:0.5.0-incubating</code>。</p>
+
+<p>这里我们使用Spark演示了Hudi的功能。但是,Hudi可以支持多种存储类型/视图,并且可以从Hive,Spark,Presto等查询引擎中查询Hudi数据集。
+我们制作了一个基于Docker设置、所有依赖系统都在本地运行的<a href="https://www.youtube.com/watch?v=VhNgUsxdrD0">演示视频</a>,
+我们建议您复制相同的设置然后按照<a href="/cn/docs/0.5.1-docker_demo.html">这里</a>的步骤自己运行这个演示。
+另外,如果您正在寻找将现有数据迁移到Hudi的方法,请参考<a href="/cn/docs/0.5.1-migration_guide.html">迁移指南</a>。</p>
+
+      </section>
+
+      <a href="#masthead__inner-wrap" class="back-to-top">Back to top &uarr;</a>
+
+
+      
+
+    </div>
+
+  </article>
+
+</div>
+
+    </div>
+
+    <div class="page__footer">
+      <footer>
+        
+<div class="row">
+  <div class="col-lg-12 footer">
+    <p>
+      <a class="footer-link-img" href="https://apache.org">
+        <img width="250px" src="/assets/images/asf_logo.svg" alt="The Apache Software Foundation">
+      </a>
+    </p>
+    <p>
+      Copyright &copy; <span id="copyright-year">2019</span> <a href="https://apache.org">The Apache Software Foundation</a>, Licensed under the Apache License, Version 2.0.
+      Hudi, Apache and the Apache feather logo are trademarks of The Apache Software Foundation. <a href="/docs/privacy">Privacy Policy</a>
+      <br>
+      Apache Hudi is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the <a href="http://incubator.apache.org/">Apache Incubator</a>.
+      Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have
+      stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a
+      reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
+    </p>
+  </div>
+</div>
+      </footer>
+    </div>
+
+    
+<script src="/assets/js/main.min.js"></script>
+
+
+  </body>
+</html>
\ No newline at end of file
diff --git a/content/cn/docs/0.5.0-docs-versions.html b/content/cn/docs/0.5.1-s3_hoodie.html
similarity index 55%
copy from content/cn/docs/0.5.0-docs-versions.html
copy to content/cn/docs/0.5.1-s3_hoodie.html
index da9849f..727fadf 100644
--- a/content/cn/docs/0.5.0-docs-versions.html
+++ b/content/cn/docs/0.5.1-s3_hoodie.html
@@ -3,17 +3,17 @@
   <head>
     <meta charset="utf-8">
 
-<!-- begin _includes/seo.html --><title>文档版本 - Apache Hudi</title>
-<meta name="description" content="                              Latest            英文版            中文版                                  0.5.0            英文版            中文版                  ">
+<!-- begin _includes/seo.html --><title>S3 Filesystem - Apache Hudi</title>
+<meta name="description" content="In this page, we explain how to get your Hudi spark job to store into AWS S3.">
 
 <meta property="og:type" content="article">
 <meta property="og:locale" content="en_US">
 <meta property="og:site_name" content="">
-<meta property="og:title" content="文档版本">
-<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.0-docs-versions.html">
+<meta property="og:title" content="S3 Filesystem">
+<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.1-s3_hoodie.html">
 
 
-  <meta property="og:description" content="                              Latest            英文版            中文版                                  0.5.0            英文版            中文版                  ">
+  <meta property="og:description" content="In this page, we explain how to get your Hudi spark job to store into AWS S3.">
 
 
 
@@ -147,7 +147,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-quick-start-guide.html" class="">快速开始</a></li>
+              <li><a href="/cn/docs/0.5.1-quick-start-guide.html" class="">快速开始</a></li>
             
 
           
@@ -158,7 +158,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-use_cases.html" class="">使用案例</a></li>
+              <li><a href="/cn/docs/0.5.1-use_cases.html" class="">使用案例</a></li>
             
 
           
@@ -169,7 +169,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-powered_by.html" class="">演讲 & hudi 用户</a></li>
+              <li><a href="/cn/docs/0.5.1-powered_by.html" class="">演讲 & hudi 用户</a></li>
             
 
           
@@ -180,7 +180,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-comparison.html" class="">对比</a></li>
+              <li><a href="/cn/docs/0.5.1-comparison.html" class="">对比</a></li>
             
 
           
@@ -191,7 +191,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-docker_demo.html" class="">Docker 示例</a></li>
+              <li><a href="/cn/docs/0.5.1-docker_demo.html" class="">Docker 示例</a></li>
             
 
           
@@ -214,7 +214,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-concepts.html" class="">概念</a></li>
+              <li><a href="/cn/docs/0.5.1-concepts.html" class="">概念</a></li>
             
 
           
@@ -225,7 +225,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-writing_data.html" class="">写入数据</a></li>
+              <li><a href="/cn/docs/0.5.1-writing_data.html" class="">写入数据</a></li>
             
 
           
@@ -236,7 +236,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-querying_data.html" class="">查询数据</a></li>
+              <li><a href="/cn/docs/0.5.1-querying_data.html" class="">查询数据</a></li>
             
 
           
@@ -247,7 +247,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-configurations.html" class="">配置</a></li>
+              <li><a href="/cn/docs/0.5.1-configurations.html" class="">配置</a></li>
             
 
           
@@ -258,7 +258,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-performance.html" class="">性能</a></li>
+              <li><a href="/cn/docs/0.5.1-performance.html" class="">性能</a></li>
             
 
           
@@ -269,7 +269,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-admin_guide.html" class="">管理</a></li>
+              <li><a href="/cn/docs/0.5.1-deployment.html" class="">管理</a></li>
             
 
           
@@ -292,7 +292,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-docs-versions.html" class="active">文档版本</a></li>
+              <li><a href="/cn/docs/0.5.1-docs-versions.html" class="">文档版本</a></li>
             
 
           
@@ -303,7 +303,7 @@
             
 
             
-              <li><a href="/cn/docs/0.5.0-privacy.html" class="">版权信息</a></li>
+              <li><a href="/cn/docs/0.5.1-privacy.html" class="">版权信息</a></li>
             
 
           
@@ -324,7 +324,7 @@
     <div class="page__inner-wrap">
       
         <header>
-          <h1 id="page-title" class="page__title" itemprop="headline">文档版本
+          <h1 id="page-title" class="page__title" itemprop="headline">S3 Filesystem
 </h1>
         </header>
       
@@ -337,23 +337,81 @@
             }
           </style>
         
-        <table class="docversions">
-    <tbody>
-      
-        <tr>
-            <th>Latest</th>
-            <td><a href="/docs/quick-start-guide.html">英文版</a></td>
-            <td><a href="/cn/docs/quick-start-guide.html">中文版</a></td>
-        </tr>
-      
-        <tr>
-            <th>0.5.0</th>
-            <td><a href="/docs/0.5.0-quick-start-guide.html">英文版</a></td>
-            <td><a href="/cn/docs/0.5.0-quick-start-guide.html">中文版</a></td>
-        </tr>
-      
-    </tbody>
-</table>
+        <p>This page explains how to configure your Hudi Spark job to store data in AWS S3.</p>
+
+<h2 id="aws-configs">AWS configs</h2>
+
+<p>There are two configurations required for Hudi-S3 compatibility:</p>
+
+<ul>
+  <li>Adding AWS credentials for Hudi</li>
+  <li>Adding the required jars to the classpath</li>
+</ul>
+
+<h3 id="aws-credentials">AWS Credentials</h3>
+
+<p>The simplest way to use Hudi with S3 is to configure your <code class="highlighter-rouge">SparkSession</code> or <code class="highlighter-rouge">SparkContext</code> with S3 credentials; Hudi will automatically pick them up and talk to S3.</p>
+
+<p>Alternatively, add the required configs to your core-site.xml, from where Hudi can fetch them. Replace the <code class="highlighter-rouge">fs.defaultFS</code> value with your S3 bucket name, and Hudi should be able to read from and write to the bucket.</p>
+
+<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="nt">&lt;property&gt;</span>
+      <span class="nt">&lt;name&gt;</span>fs.defaultFS<span class="nt">&lt;/name&gt;</span>
+      <span class="nt">&lt;value&gt;</span>s3://ysharma<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+
+  <span class="nt">&lt;property&gt;</span>
+      <span class="nt">&lt;name&gt;</span>fs.s3.impl<span class="nt">&lt;/name&gt;</span>
+      <span class="nt">&lt;value&gt;</span>org.apache.hadoop.fs.s3native.NativeS3FileSystem<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+
+  <span class="nt">&lt;property&gt;</span>
+      <span class="nt">&lt;name&gt;</span>fs.s3.awsAccessKeyId<span class="nt">&lt;/name&gt;</span>
+      <span class="nt">&lt;value&gt;</span>AWS_KEY<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+
+  <span class="nt">&lt;property&gt;</span>
+       <span class="nt">&lt;name&gt;</span>fs.s3.awsSecretAccessKey<span class="nt">&lt;/name&gt;</span>
+       <span class="nt">&lt;value&gt;</span>AWS_SECRET<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+
+  <span class="nt">&lt;property&gt;</span>
+       <span class="nt">&lt;name&gt;</span>fs.s3n.awsAccessKeyId<span class="nt">&lt;/name&gt;</span>
+       <span class="nt">&lt;value&gt;</span>AWS_KEY<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+
+  <span class="nt">&lt;property&gt;</span>
+       <span class="nt">&lt;name&gt;</span>fs.s3n.awsSecretAccessKey<span class="nt">&lt;/name&gt;</span>
+       <span class="nt">&lt;value&gt;</span>AWS_SECRET<span class="nt">&lt;/value&gt;</span>
+  <span class="nt">&lt;/property&gt;</span>
+</code></pre></div></div>
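For jobs that configure Spark programmatically instead of through core-site.xml, the same Hadoop properties can also be passed as `spark.hadoop.*` entries, which Spark copies into the job's Hadoop Configuration. The sketch below is illustrative only — the helper function and the placeholder key values are not part of Hudi:

```python
def as_spark_conf(hadoop_props):
    """Prefix each Hadoop property with 'spark.hadoop.' so that Spark
    copies it into the job's Hadoop Configuration."""
    return {"spark.hadoop." + key: value for key, value in hadoop_props.items()}

# Placeholder credentials, mirroring the core-site.xml example above.
conf = as_spark_conf({
    "fs.s3a.access.key": "AWS_KEY",
    "fs.s3a.secret.key": "AWS_SECRET",
})
for key, value in sorted(conf.items()):
    print(f"--conf {key}={value}")
```

Each printed pair can be passed to `spark-submit` as a `--conf` flag, or set via `SparkSession.builder.config(key, value)`.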
+
+<p>Utilities such as hudi-cli or the DeltaStreamer tool can pick up S3 credentials via environment variables prefixed with <code class="highlighter-rouge">HOODIE_ENV_</code>. For example, the bash snippet below sets up
+such variables so that the CLI can work on datasets stored in S3:</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">export</span> <span class="n">HOODIE_ENV_fs_DOT_s3a_DOT_access_DOT_key</span><span class="o">=</span><span class="n">$accessKey</span>
+<span class="n">export</span> <span class="n">HOODIE_ENV_fs_DOT_s3a_DOT_secret_DOT_key</span><span class="o">=</span><span class="n">$secretKey</span>
+<span class="n">export</span> <span class="n">HOODIE_ENV_fs_DOT_s3_DOT_awsAccessKeyId</span><span class="o">=</span><span class="n">$accessKey</span>
+<span class="n">export</span> <span class="n">HOODIE_ENV_fs_DOT_s3_DOT_awsSecretAccessKey</span><span class="o">=</span><span class="n">$secretKey</span>
+<span class="n">export</span> <span class="n">HOODIE_ENV_fs_DOT_s3n_DOT_awsAccessKeyId</span><span class="o">=</span><span class="n">$accessKey</span>
+<span class="n">export</span> <span class="n">HOODIE_ENV_fs_DOT_s3n_DOT_awsSecretAccessKey</span><span class="o">=</span><span class="n">$secretKey</span>
+<span class="n">export</span> <span class="n">HOODIE_ENV_fs_DOT_s3n_DOT_impl</span><span class="o">=</span><span class="n">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hadoop</span><span class="o">.</span><span class="na">fs</span><span class="o">.</span><span class="na">s3a</span><span class="o">.</span><span class="na">S3AFileSystem</span>
+</code></pre></div></div>
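The variable names above follow a mechanical encoding: prefix the Hadoop property name with `HOODIE_ENV_` and replace each `.` with `_DOT_`. A small helper (hypothetical, inferred from the snippet above) makes the mapping explicit:

```python
def to_hoodie_env(prop):
    """Encode a Hadoop property name as a HOODIE_ENV_ variable name by
    replacing each '.' with '_DOT_' (mapping inferred from the exports above)."""
    return "HOODIE_ENV_" + prop.replace(".", "_DOT_")

print(to_hoodie_env("fs.s3a.access.key"))  # HOODIE_ENV_fs_DOT_s3a_DOT_access_DOT_key
```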
+
+<h3 id="aws-libs">AWS Libs</h3>
+
+<p>AWS Hadoop libraries to add to the classpath:</p>
+
+<ul>
+  <li>com.amazonaws:aws-java-sdk:1.10.34</li>
+  <li>org.apache.hadoop:hadoop-aws:2.7.3</li>
+</ul>
+
+<p>AWS Glue Data Catalog libraries are needed if the AWS Glue Data Catalog is used:</p>
+
+<ul>
+  <li>com.amazonaws.glue:aws-glue-datacatalog-hive2-client:1.11.0</li>
+  <li>com.amazonaws:aws-java-sdk-glue:1.11.475</li>
+</ul>
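One common way to put these jars on the classpath is `spark-submit --packages` with the Maven coordinates above. The snippet below only assembles the argument string; the versions are the ones listed on this page and may need updating for newer Hadoop/Spark builds:

```python
aws_packages = [
    "com.amazonaws:aws-java-sdk:1.10.34",
    "org.apache.hadoop:hadoop-aws:2.7.3",
]
# Append the Glue packages only when the AWS Glue Data Catalog is used.
glue_packages = [
    "com.amazonaws.glue:aws-glue-datacatalog-hive2-client:1.11.0",
    "com.amazonaws:aws-java-sdk-glue:1.11.475",
]
packages_arg = ",".join(aws_packages + glue_packages)
print(f"spark-submit --packages {packages_arg} ...")
```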
 
       </section>
 
diff --git a/content/cn/docs/0.5.1-use_cases.html b/content/cn/docs/0.5.1-use_cases.html
new file mode 100644
index 0000000..e52dc9d
--- /dev/null
+++ b/content/cn/docs/0.5.1-use_cases.html
@@ -0,0 +1,445 @@
+<!doctype html>
+<html lang="en" class="no-js">
+  <head>
+    <meta charset="utf-8">
+
+<!-- begin _includes/seo.html --><title>使用案例 - Apache Hudi</title>
+<meta name="description" content="以下是一些使用Hudi的示例,说明了加快处理速度和提高效率的好处">
+
+<meta property="og:type" content="article">
+<meta property="og:locale" content="en_US">
+<meta property="og:site_name" content="">
+<meta property="og:title" content="使用案例">
+<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.1-use_cases.html">
+
+
+  <meta property="og:description" content="以下是一些使用Hudi的示例,说明了加快处理速度和提高效率的好处">
+
+
+
+
+
+  <meta property="article:modified_time" content="2019-12-30T14:59:57-05:00">
+
+
+
+
+
+
+
+<!-- end _includes/seo.html -->
+
+
+<!--<link href="/feed.xml" type="application/atom+xml" rel="alternate" title=" Feed">-->
+
+<!-- https://t.co/dKP3o1e -->
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+<script>
+  document.documentElement.className = document.documentElement.className.replace(/\bno-js\b/g, '') + ' js ';
+</script>
+
+<!-- For all browsers -->
+<link rel="stylesheet" href="/assets/css/main.css">
+
+<!--[if IE]>
+  <style>
+    /* old IE unsupported flexbox fixes */
+    .greedy-nav .site-title {
+      padding-right: 3em;
+    }
+    .greedy-nav button {
+      position: absolute;
+      top: 0;
+      right: 0;
+      height: 100%;
+    }
+  </style>
+<![endif]-->
+
+
+
+<link rel="icon" type="image/x-icon" href="/assets/images/favicon.ico">
+<link rel="stylesheet" href="/assets/css/font-awesome.min.css">
+
+  </head>
+
+  <body class="layout--single">
+    <!--[if lt IE 9]>
+<div class="notice--danger align-center" style="margin: 0;">You are using an <strong>outdated</strong> browser. Please <a href="https://browsehappy.com/">upgrade your browser</a> to improve your experience.</div>
+<![endif]-->
+
+    <div class="masthead">
+  <div class="masthead__inner-wrap" id="masthead__inner-wrap">
+    <div class="masthead__menu">
+      <nav id="site-nav" class="greedy-nav">
+        
+          <a class="site-logo" href="/">
+              <div style="width: 150px; height: 40px">
+              </div>
+          </a>
+        
+        <a class="site-title" href="/">
+          
+        </a>
+        <ul class="visible-links"><li class="masthead__menu-item">
+              <a href="/cn/docs/quick-start-guide.html" target="_self" >文档</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/community.html" target="_self" >社区</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/activity.html" target="_self" >动态</a>
+            </li><li class="masthead__menu-item">
+              <a href="https://cwiki.apache.org/confluence/display/HUDI/FAQ" target="_blank" >FAQ</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/releases.html" target="_self" >发布</a>
+            </li></ul>
+        <button class="greedy-nav__toggle hidden" type="button">
+          <span class="visually-hidden">Toggle menu</span>
+          <div class="navicon"></div>
+        </button>
+        <ul class="hidden-links hidden"></ul>
+      </nav>
+    </div>
+  </div>
+</div>
+<!--
+<p class="notice--warning" style="margin: 0 !important; text-align: center !important;"><strong>Note:</strong> This site is work in progress, if you notice any issues, please <a target="_blank" href="https://github.com/apache/incubator-hudi/issues">Report on Issue</a>.
+  Click <a href="/"> here</a> back to old site.</p>
+-->
+
+    <div class="initial-content">
+      <div id="main" role="main">
+  
+
+  <div class="sidebar sticky">
+
+  
+
+  
+
+    
+      
+
+
+
+
+
+
+
+<nav class="nav__list">
+  
+  <input id="ac-toc" name="accordion-toc" type="checkbox" />
+  <label for="ac-toc">文档菜单</label>
+  <ul class="nav__items">
+    
+      <li>
+        
+          <span class="nav__sub-title">入门指南</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-quick-start-guide.html" class="">快速开始</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-use_cases.html" class="active">使用案例</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-powered_by.html" class="">演讲 & hudi 用户</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-comparison.html" class="">对比</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-docker_demo.html" class="">Docker 示例</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">帮助文档</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-concepts.html" class="">概念</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-writing_data.html" class="">写入数据</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-querying_data.html" class="">查询数据</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-configurations.html" class="">配置</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-performance.html" class="">性能</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-deployment.html" class="">管理</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">其他信息</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-docs-versions.html" class="">文档版本</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-privacy.html" class="">版权信息</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+  </ul>
+</nav>
+    
+
+  
+  </div>
+
+
+  <article class="page" itemscope itemtype="https://schema.org/CreativeWork">
+
+    <div class="page__inner-wrap">
+      
+        <header>
+          <h1 id="page-title" class="page__title" itemprop="headline">使用案例
+</h1>
+        </header>
+      
+
+      <section class="page__content" itemprop="text">
+        
+        <aside class="sidebar__right sticky">
+          <nav class="toc">
+            <header><h4 class="nav__title"><i class="fas fa-file-alt"></i> IN THIS PAGE</h4></header>
+            <ul class="toc__menu">
+  <li><a href="#近实时摄取">近实时摄取</a></li>
+  <li><a href="#近实时分析">近实时分析</a></li>
+  <li><a href="#增量处理管道">增量处理管道</a></li>
+  <li><a href="#dfs的数据分发">DFS的数据分发</a></li>
+</ul>
+          </nav>
+        </aside>
+        
+        <p>以下是一些使用Hudi的示例,说明了加快处理速度和提高效率的好处</p>
+
+<h2 id="近实时摄取">近实时摄取</h2>
+
+<p>将外部源(如事件日志、数据库、外部源)的数据摄取到<a href="http://martinfowler.com/bliki/DataLake.html">Hadoop数据湖</a>是一个众所周知的问题。
+尽管这些数据对整个组织来说是最有价值的,但不幸的是,在大多数(如果不是全部)Hadoop部署中都使用零散的方式解决,即使用多个不同的摄取工具。</p>
+
+<p>对于RDBMS摄取,Hudi提供 <strong>通过更新插入达到更快加载</strong>,而不是昂贵且低效的批量加载。例如,您可以读取MySQL BIN日志或<a href="https://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_incremental_imports">Sqoop增量导入</a>并将其应用于
+DFS上的等效Hudi表。这比<a href="https://sqoop.apache.org/docs/1.4.0-incubating/SqoopUserGuide.html#id1770457">批量合并任务</a>及<a href="http://hortonworks.com/blog/four-step-strategy-incremental-updates-hive/">复杂的手工合并工作流</a>更快/更有效率。</p>
+
+<p>对于NoSQL数据存储,如<a href="http://cassandra.apache.org/">Cassandra</a> / <a href="http://www.project-voldemort.com/voldemort/">Voldemort</a> / <a href="https://hbase.apache.org/">HBase</a>,即使是中等规模大小也会存储数十亿行。
+毫无疑问, <strong>全量加载不可行</strong>,如果摄取需要跟上较高的更新量,那么则需要更有效的方法。</p>
+
+<p>即使对于像<a href="http://kafka.apache.org">Kafka</a>这样的不可变数据源,Hudi也可以 <strong>强制在HDFS上使用最小文件大小</strong>, 这采取了综合方式解决<a href="https://blog.cloudera.com/blog/2009/02/the-small-files-problem/">HDFS小文件问题</a>来改善NameNode的健康状况。这对事件流来说更为重要,因为它通常具有较高容量(例如:点击流),如果管理不当,可能会对Hadoop集群造成严重损害。</p>
+
+<p>在所有源中,通过<code class="highlighter-rouge">commits</code>这一概念,Hudi增加了以原子方式向消费者发布新数据的功能,这种功能十分必要。</p>
+
+<h2 id="近实时分析">近实时分析</h2>
+
+<p>通常,实时<a href="https://en.wikipedia.org/wiki/Data_mart">数据集市</a>由专业(实时)数据分析存储提供支持,例如<a href="http://druid.io/">Druid</a>或<a href="http://www.memsql.com/">Memsql</a>或<a href="http://opentsdb.net/">OpenTSDB</a>。
+这对于较小规模的数据量来说绝对是完美的(<a href="https://blog.twitter.com/2015/hadoop-filesystem-at-twitter">相比于这样安装Hadoop</a>),这种情况需要在亚秒级响应查询,例如系统监控或交互式实时分析。
+但是,由于Hadoop上的数据太陈旧了,通常这些系统会被滥用于非交互式查询,这导致利用率不足和硬件/许可证成本的浪费。</p>
+
+<p>另一方面,Hadoop上的交互式SQL解决方案(如Presto和SparkSQL)表现出色,在 <strong>几秒钟内完成查询</strong>。
+通过将 <strong>数据新鲜度提高到几分钟</strong>,Hudi可以提供一个更有效的替代方案,并支持存储在DFS中的 <strong>数量级更大的数据集</strong> 的实时分析。
+此外,Hudi没有外部依赖(如专用于实时分析的HBase集群),因此可以在更新的分析上实现更快的分析,而不会增加操作开销。</p>
+
+<h2 id="增量处理管道">增量处理管道</h2>
+
+<p>Hadoop提供的一个基本能力是构建一系列数据集,这些数据集通过表示为工作流的DAG相互派生。
+工作流通常取决于多个上游工作流输出的新数据,新数据的可用性传统上由新的DFS文件夹/Hive分区指示。
+让我们举一个具体的例子来说明这点。上游工作流<code class="highlighter-rouge">U</code>可以每小时创建一个Hive分区,在每小时结束时(processing_time)使用该小时的数据(event_time),提供1小时的有效新鲜度。
+然后,下游工作流<code class="highlighter-rouge">D</code>在<code class="highlighter-rouge">U</code>结束后立即启动,并在下一个小时内自行处理,将有效延迟时间增加到2小时。</p>
+
+<p>上面的示例忽略了迟到的数据,即<code class="highlighter-rouge">processing_time</code>和<code class="highlighter-rouge">event_time</code>分开时。
+不幸的是,在今天的后移动和前物联网世界中,<strong>来自间歇性连接的移动设备和传感器的延迟数据是常态,而不是异常</strong>。
+在这种情况下,保证正确性的唯一补救措施是<a href="https://falcon.apache.org/FalconDocumentation.html#Handling_late_input_data">重新处理最后几个小时</a>的数据,
+每小时一遍又一遍,这可能会严重影响整个生态系统的效率。例如,试想一下,在数百个工作流中每小时重新处理TB数据。</p>
+
+<p>Hudi通过以单个记录为粒度的方式(而不是文件夹/分区)从上游 Hudi数据集<code class="highlighter-rouge">HU</code>消费新数据(包括迟到数据),来解决上面的问题。
+应用处理逻辑,并使用下游Hudi数据集<code class="highlighter-rouge">HD</code>高效更新/协调迟到数据。在这里,<code class="highlighter-rouge">HU</code>和<code class="highlighter-rouge">HD</code>可以以更频繁的时间被连续调度
+比如15分钟,并且<code class="highlighter-rouge">HD</code>提供端到端30分钟的延迟。</p>
+
+<p>为实现这一目标,Hudi采用了类似于<a href="https://spark.apache.org/docs/latest/streaming-programming-guide.html#join-operations">Spark Streaming</a>、发布/订阅系统等流处理框架,以及像<a href="http://kafka.apache.org/documentation/#theconsumer">Kafka</a>
+或<a href="https://docs.oracle.com/cd/E11882_01/server.112/e16545/xstrm_cncpt.htm#XSTRM187">Oracle XStream</a>等数据库复制技术的类似概念。
+如果感兴趣,可以在<a href="https://www.oreilly.com/ideas/ubers-case-for-incremental-processing-on-hadoop">这里</a>找到有关增量处理(相比于流处理和批处理)好处的更详细解释。</p>
+
+<h2 id="dfs的数据分发">DFS的数据分发</h2>
+
+<p>一个常用场景是先在Hadoop上处理数据,然后将其分发回在线服务存储层,以供应用程序使用。
+例如,一个Spark管道可以<a href="https://eng.uber.com/telematics/">确定Hadoop上的紧急制动事件</a>并将它们加载到服务存储层(如ElasticSearch)中,供Uber应用程序使用以增加安全驾驶。这种用例中,通常架构会在Hadoop和服务存储之间引入<code class="highlighter-rouge">队列</code>,以防止目标服务存储被压垮。
+对于队列的选择,一种流行的选择是Kafka,这个模型经常导致 <strong>在DFS上存储相同数据的冗余(用于计算结果的离线分析)和Kafka(用于分发)</strong></p>
+
+<p>通过将每次运行的Spark管道更新插入的输出转换为Hudi数据集,Hudi可以再次有效地解决这个问题,然后可以以增量方式获取尾部数据(就像Kafka topic一样)然后写入服务存储层。</p>
+
+      </section>
+
+      <a href="#masthead__inner-wrap" class="back-to-top">Back to top &uarr;</a>
+
+
+      
+
+    </div>
+
+  </article>
+
+</div>
+
+    </div>
+
+    <div class="page__footer">
+      <footer>
+        
+<div class="row">
+  <div class="col-lg-12 footer">
+    <p>
+      <a class="footer-link-img" href="https://apache.org">
+        <img width="250px" src="/assets/images/asf_logo.svg" alt="The Apache Software Foundation">
+      </a>
+    </p>
+    <p>
+      Copyright &copy; <span id="copyright-year">2019</span> <a href="https://apache.org">The Apache Software Foundation</a>, Licensed under the Apache License, Version 2.0.
+      Hudi, Apache and the Apache feather logo are trademarks of The Apache Software Foundation. <a href="/docs/privacy">Privacy Policy</a>
+      <br>
+      Apache Hudi is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the <a href="http://incubator.apache.org/">Apache Incubator</a>.
+      Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have
+      stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a
+      reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
+    </p>
+  </div>
+</div>
+      </footer>
+    </div>
+
+    
+<script src="/assets/js/main.min.js"></script>
+
+
+  </body>
+</html>
\ No newline at end of file
diff --git a/content/cn/docs/0.5.1-writing_data.html b/content/cn/docs/0.5.1-writing_data.html
new file mode 100644
index 0000000..d9dfbea
--- /dev/null
+++ b/content/cn/docs/0.5.1-writing_data.html
@@ -0,0 +1,608 @@
+<!doctype html>
+<html lang="en" class="no-js">
+  <head>
+    <meta charset="utf-8">
+
+<!-- begin _includes/seo.html --><title>写入 Hudi 数据集 - Apache Hudi</title>
+<meta name="description" content="这一节我们将介绍使用DeltaStreamer工具从外部源甚至其他Hudi数据集摄取新更改的方法,以及通过使用Hudi数据源的upserts加快大型Spark作业的方法。对于此类数据集,我们可以使用各种查询引擎查询它们。">
+
+<meta property="og:type" content="article">
+<meta property="og:locale" content="en_US">
+<meta property="og:site_name" content="">
+<meta property="og:title" content="写入 Hudi 数据集">
+<meta property="og:url" content="https://hudi.apache.org/cn/docs/0.5.1-writing_data.html">
+
+
+  <meta property="og:description" content="这一节我们将介绍使用DeltaStreamer工具从外部源甚至其他Hudi数据集摄取新更改的方法,以及通过使用Hudi数据源的upserts加快大型Spark作业的方法。对于此类数据集,我们可以使用各种查询引擎查询它们。">
+
+
+
+
+
+  <meta property="article:modified_time" content="2019-12-30T14:59:57-05:00">
+
+
+
+
+
+
+
+<!-- end _includes/seo.html -->
+
+
+<!--<link href="/feed.xml" type="application/atom+xml" rel="alternate" title=" Feed">-->
+
+<!-- https://t.co/dKP3o1e -->
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+<script>
+  document.documentElement.className = document.documentElement.className.replace(/\bno-js\b/g, '') + ' js ';
+</script>
+
+<!-- For all browsers -->
+<link rel="stylesheet" href="/assets/css/main.css">
+
+<!--[if IE]>
+  <style>
+    /* old IE unsupported flexbox fixes */
+    .greedy-nav .site-title {
+      padding-right: 3em;
+    }
+    .greedy-nav button {
+      position: absolute;
+      top: 0;
+      right: 0;
+      height: 100%;
+    }
+  </style>
+<![endif]-->
+
+
+
+<link rel="icon" type="image/x-icon" href="/assets/images/favicon.ico">
+<link rel="stylesheet" href="/assets/css/font-awesome.min.css">
+
+  </head>
+
+  <body class="layout--single">
+    <!--[if lt IE 9]>
+<div class="notice--danger align-center" style="margin: 0;">You are using an <strong>outdated</strong> browser. Please <a href="https://browsehappy.com/">upgrade your browser</a> to improve your experience.</div>
+<![endif]-->
+
+    <div class="masthead">
+  <div class="masthead__inner-wrap" id="masthead__inner-wrap">
+    <div class="masthead__menu">
+      <nav id="site-nav" class="greedy-nav">
+        
+          <a class="site-logo" href="/">
+              <div style="width: 150px; height: 40px">
+              </div>
+          </a>
+        
+        <a class="site-title" href="/">
+          
+        </a>
+        <ul class="visible-links"><li class="masthead__menu-item">
+              <a href="/cn/docs/quick-start-guide.html" target="_self" >文档</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/community.html" target="_self" >社区</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/activity.html" target="_self" >动态</a>
+            </li><li class="masthead__menu-item">
+              <a href="https://cwiki.apache.org/confluence/display/HUDI/FAQ" target="_blank" >FAQ</a>
+            </li><li class="masthead__menu-item">
+              <a href="/cn/releases.html" target="_self" >发布</a>
+            </li></ul>
+        <button class="greedy-nav__toggle hidden" type="button">
+          <span class="visually-hidden">Toggle menu</span>
+          <div class="navicon"></div>
+        </button>
+        <ul class="hidden-links hidden"></ul>
+      </nav>
+    </div>
+  </div>
+</div>
+<!--
+<p class="notice--warning" style="margin: 0 !important; text-align: center !important;"><strong>Note:</strong> This site is work in progress, if you notice any issues, please <a target="_blank" href="https://github.com/apache/incubator-hudi/issues">Report on Issue</a>.
+  Click <a href="/"> here</a> back to old site.</p>
+-->
+
+    <div class="initial-content">
+      <div id="main" role="main">
+  
+
+  <div class="sidebar sticky">
+
+  
+
+  
+
+    
+      
+
+
+
+
+
+
+
+<nav class="nav__list">
+  
+  <input id="ac-toc" name="accordion-toc" type="checkbox" />
+  <label for="ac-toc">文档菜单</label>
+  <ul class="nav__items">
+    
+      <li>
+        
+          <span class="nav__sub-title">入门指南</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-quick-start-guide.html" class="">快速开始</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-use_cases.html" class="">使用案例</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-powered_by.html" class="">演讲 & hudi 用户</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-comparison.html" class="">对比</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-docker_demo.html" class="">Docker 示例</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">帮助文档</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-concepts.html" class="">概念</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-writing_data.html" class="active">写入数据</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-querying_data.html" class="">查询数据</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-configurations.html" class="">配置</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-performance.html" class="">性能</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-deployment.html" class="">管理</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">其他信息</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-docs-versions.html" class="">文档版本</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/cn/docs/0.5.1-privacy.html" class="">版权信息</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+  </ul>
+</nav>
+    
+
+  
+  </div>
+
+
+  <article class="page" itemscope itemtype="https://schema.org/CreativeWork">
+
+    <div class="page__inner-wrap">
+      
+        <header>
+          <h1 id="page-title" class="page__title" itemprop="headline">写入 Hudi 数据集
+</h1>
+        </header>
+      
+
+      <section class="page__content" itemprop="text">
+        
+        <aside class="sidebar__right sticky">
+          <nav class="toc">
+            <header><h4 class="nav__title"><i class="fas fa-file-alt"></i> IN THIS PAGE</h4></header>
+            <ul class="toc__menu">
+  <li><a href="#写操作">写操作</a></li>
+  <li><a href="#deltastreamer">DeltaStreamer</a></li>
+  <li><a href="#datasource-writer">Datasource Writer</a></li>
+  <li><a href="#与hive同步">与Hive同步</a></li>
+  <li><a href="#删除数据">删除数据</a></li>
+  <li><a href="#存储管理">存储管理</a></li>
+</ul>
+          </nav>
+        </aside>
+        
+        <p>这一节我们将介绍使用<a href="#deltastreamer">DeltaStreamer</a>工具从外部源甚至其他Hudi数据集摄取新更改的方法,
+以及通过使用<a href="#datasource-writer">Hudi数据源</a>的upserts加快大型Spark作业的方法。
+对于此类数据集,我们可以使用各种查询引擎<a href="/cn/docs/0.5.1-querying_data.html">查询</a>它们。</p>
+
+<h2 id="写操作">写操作</h2>
+
+<p>在此之前,了解Hudi数据源及delta streamer工具提供的三种不同的写操作以及如何最佳利用它们可能会有所帮助。
+这些操作可以在针对数据集发出的每个提交/增量提交中进行选择/更改。</p>
+
+<ul>
+  <li><strong>UPSERT(插入更新)</strong> :这是默认操作,在该操作中,通过查找索引,首先将输入记录标记为插入或更新。
+ 在运行启发式方法以确定如何最好地将这些记录放到存储上,如优化文件大小之类后,这些记录最终会被写入。
+ 对于诸如数据库更改捕获之类的用例,建议该操作,因为输入几乎肯定包含更新。</li>
+  <li><strong>INSERT(插入)</strong> :就使用启发式方法确定文件大小而言,此操作与插入更新(UPSERT)非常相似,但此操作完全跳过了索引查找步骤。
+ 因此,对于日志重复数据删除等用例(结合下面提到的过滤重复项的选项),它可以比插入更新快得多。
+ 插入也适用于这种用例,这种情况数据集可以允许重复项,但只需要Hudi的事务写/增量提取/存储管理功能。</li>
+  <li><strong>BULK_INSERT(批插入)</strong> :插入更新和插入操作都将输入记录保存在内存中,以加快存储优化启发式计算的速度(以及其它未提及的方面)。
+ 所以对Hudi数据集进行初始加载/引导时这两种操作会很低效。批量插入提供与插入相同的语义,但同时实现了基于排序的数据写入算法,
+ 该算法可以很好地扩展数百TB的初始负载。但是,相比于插入和插入更新能保证文件大小,批插入在调整文件大小上只能尽力而为。</li>
+</ul>
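The three operations above are selected per write via the Hudi datasource option `hoodie.datasource.write.operation`. A minimal sketch follows (option names per the Hudi configuration docs; the record-key and precombine field names are placeholders):

```python
# Sketch: choosing the write operation for a Hudi datasource write.
# Option names follow the Hudi configuration docs; field values are placeholders.
hudi_options = {
    "hoodie.datasource.write.operation": "upsert",  # or "insert" / "bulk_insert"
    "hoodie.datasource.write.recordkey.field": "uuid",
    "hoodie.datasource.write.precombine.field": "ts",
}
# With a SparkSession available (not shown here), the write would look like:
# df.write.format("hudi").options(**hudi_options).mode("append").save(base_path)
print(hudi_options["hoodie.datasource.write.operation"])
```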
+
+<h2 id="deltastreamer">DeltaStreamer</h2>
+
+<p><code class="highlighter-rouge">HoodieDeltaStreamer</code>实用工具 (hudi-utilities-bundle中的一部分) 提供了从DFS或Kafka等不同来源进行摄取的方式,并具有以下功能。</p>
+
+<ul>
+  <li>从Kafka单次摄取新事件,从Sqoop、HiveIncrementalPuller输出或DFS文件夹中的多个文件
+ <a href="https://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_incremental_imports">增量导入</a></li>
+  <li>支持json、avro或自定义记录类型的传入数据</li>
+  <li>管理检查点,回滚和恢复</li>
+  <li>利用DFS或Confluent <a href="https://github.com/confluentinc/schema-registry">schema注册表</a>的Avro模式。</li>
+  <li>支持自定义转换操作</li>
+</ul>
+
+<p>命令行选项更详细地描述了这些功能:</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">[</span><span class="n">hoodie</span><span class="o">]</span><span class="err">$</span> <span class="n">spark</span><span class="o">-</span><span class="n">submit</span> <span class="o">--</span><span class="kd">class</span> <span class="nc">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</sp [...]
+<span class="nl">Usage:</span> <span class="o">&lt;</span><span class="n">main</span> <span class="kd">class</span><span class="err">&gt;</span> <span class="err">[</span><span class="nc">options</span><span class="o">]</span>
+  <span class="nl">Options:</span>
+    <span class="o">--</span><span class="n">commit</span><span class="o">-</span><span class="n">on</span><span class="o">-</span><span class="n">errors</span>
+        <span class="nc">Commit</span> <span class="n">even</span> <span class="n">when</span> <span class="n">some</span> <span class="n">records</span> <span class="n">failed</span> <span class="n">to</span> <span class="n">be</span> <span class="n">written</span>
+      <span class="nl">Default:</span> <span class="kc">false</span>
+    <span class="o">--</span><span class="n">enable</span><span class="o">-</span><span class="n">hive</span><span class="o">-</span><span class="n">sync</span>
+          <span class="nc">Enable</span> <span class="n">syncing</span> <span class="n">to</span> <span class="n">hive</span>
+       <span class="nl">Default:</span> <span class="kc">false</span>
+    <span class="o">--</span><span class="n">filter</span><span class="o">-</span><span class="n">dupes</span>
+          <span class="nc">Should</span> <span class="n">duplicate</span> <span class="n">records</span> <span class="n">from</span> <span class="n">source</span> <span class="n">be</span> <span class="n">dropped</span><span class="o">/</span><span class="n">filtered</span> <span class="n">out before</span> 
+          <span class="n">insert</span><span class="o">/</span><span class="n">bulk</span><span class="o">-</span><span class="n">insert</span> 
+      <span class="nl">Default:</span> <span class="kc">false</span>
+    <span class="o">--</span><span class="n">help</span><span class="o">,</span> <span class="o">-</span><span class="n">h</span>
+    <span class="o">--</span><span class="n">hudi</span><span class="o">-</span><span class="n">conf</span>
+          <span class="nc">Any</span> <span class="n">configuration</span> <span class="n">that</span> <span class="n">can</span> <span class="n">be</span> <span class="n">set</span> <span class="n">in</span> <span class="n">the</span> <span class="n">properties</span> <span class="nf">file</span> <span class="o">(</span><span class="n">using</span> <span class="n">the</span> <span class="no">CLI</span> 
+          <span class="n">parameter</span> <span class="s">"--propsFilePath"</span><span class="o">)</span> <span class="n">can</span> <span class="n">also</span> <span class="n">be</span> <span class="n">passed</span> <span class="n">command</span> <span class="n">line</span> <span class="n">using</span> <span class="k">this</span> 
+          <span class="n">parameter</span> 
+          <span class="nl">Default:</span> <span class="o">[]</span>
+    <span class="o">--</span><span class="n">op</span>
+      <span class="nc">Takes</span> <span class="n">one</span> <span class="n">of</span> <span class="n">these</span> <span class="n">values</span> <span class="o">:</span> <span class="no">UPSERT</span> <span class="o">(</span><span class="k">default</span><span class="o">),</span> <span class="no">INSERT</span> <span class="o">(</span><span class="n">use</span> <span class="n">when</span> <span class="n">input</span> <span class="n">is</span>
+      <span class="n">purely</span> <span class="k">new</span> <span class="n">data</span><span class="o">/</span><span class="n">inserts</span> <span class="n">to</span> <span class="n">gain</span> <span class="n">speed</span><span class="o">)</span>
+      <span class="nl">Default:</span> <span class="no">UPSERT</span>
+      <span class="nc">Possible</span> <span class="nl">Values:</span> <span class="o">[</span><span class="no">UPSERT</span><span class="o">,</span> <span class="no">INSERT</span><span class="o">,</span> <span class="no">BULK_INSERT</span><span class="o">]</span>
+    <span class="o">--</span><span class="n">payload</span><span class="o">-</span><span class="kd">class</span>
+      <span class="nc">subclass</span> <span class="n">of</span> <span class="nc">HoodieRecordPayload</span><span class="o">,</span> <span class="n">that</span> <span class="n">works</span> <span class="n">off</span> <span class="n">a</span> <span class="nc">GenericRecord</span><span class="o">.</span>
+      <span class="nc">Implement</span> <span class="n">your</span> <span class="n">own</span><span class="o">,</span> <span class="k">if</span> <span class="n">you</span> <span class="n">want</span> <span class="n">to</span> <span class="k">do</span> <span class="n">something</span> <span class="n">other</span> <span class="n">than</span> <span class="n">overwriting</span>
+      <span class="n">existing</span> <span class="n">value</span>
+      <span class="nl">Default:</span> <span class="n">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</span><span class="na">OverwriteWithLatestAvroPayload</span>
+    <span class="o">--</span><span class="n">props</span>
+      <span class="n">path</span> <span class="n">to</span> <span class="n">properties</span> <span class="n">file</span> <span class="n">on</span> <span class="n">localfs</span> <span class="n">or</span> <span class="n">dfs</span><span class="o">,</span> <span class="n">with</span> <span class="n">configurations</span> <span class="k">for</span>
+      <span class="nc">Hudi</span> <span class="n">client</span><span class="o">,</span> <span class="n">schema</span> <span class="n">provider</span><span class="o">,</span> <span class="n">key</span> <span class="n">generator</span> <span class="n">and</span> <span class="n">data</span> <span class="n">source</span><span class="o">.</span> <span class="nc">For</span>
+      <span class="nc">Hudi</span> <span class="n">client</span> <span class="n">props</span><span class="o">,</span> <span class="n">sane</span> <span class="n">defaults</span> <span class="n">are</span> <span class="n">used</span><span class="o">,</span> <span class="n">but</span> <span class="n">recommend</span> <span class="n">use</span> <span class="n">to</span>
+      <span class="n">provide</span> <span class="n">basic</span> <span class="n">things</span> <span class="n">like</span> <span class="n">metrics</span> <span class="n">endpoints</span><span class="o">,</span> <span class="n">hive</span> <span class="n">configs</span> <span class="n">etc</span><span class="o">.</span> <span class="nc">For</span>
+      <span class="n">sources</span><span class="o">,</span> <span class="n">refer to</span> <span class="n">individual</span> <span class="n">classes</span><span class="o">,</span> <span class="k">for</span> <span class="n">supported</span> <span class="n">properties</span><span class="o">.</span>
+      <span class="nl">Default:</span> <span class="nl">file:</span><span class="c1">///Users/vinoth/bin/hoodie/src/test/resources/delta-streamer-config/dfs-source.properties</span>
+    <span class="o">--</span><span class="n">schemaprovider</span><span class="o">-</span><span class="kd">class</span>
+      <span class="nc">subclass</span> <span class="n">of</span> <span class="n">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</span><span class="na">utilities</span><span class="o">.</span><span class="na">schema</span><span class="o">.</span><span class="na">SchemaProvider</span> <span class="n">to</span> <span class="n">attach</span>
+      <span class="n">schemas</span> <span class="n">to</span> <span class="n">input</span> <span class="o">&amp;</span> <span class="n">target</span> <span class="n">table</span> <span class="n">data</span><span class="o">,</span> <span class="n">built</span> <span class="n">in</span> <span class="nl">options:</span>
+      <span class="nc">FilebasedSchemaProvider</span>
+      <span class="nl">Default:</span> <span class="n">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</span><span class="na">utilities</span><span class="o">.</span><span class="na">schema</span><span class="o">.</span><span class="na">FilebasedSchemaProvider</span>
+    <span class="o">--</span><span class="n">source</span><span class="o">-</span><span class="kd">class</span>
+      <span class="nc">Subclass</span> <span class="n">of</span> <span class="n">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</span><span class="na">utilities</span><span class="o">.</span><span class="na">sources</span> <span class="n">to</span> <span class="n">read</span> <span class="n">data</span><span class="o">.</span> <span class="nc">Built</span><span class="o">-</span><span class="n">in</span>
+      <span class="nl">options:</span> <span class="n">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</span><span class="na">utilities</span><span class="o">.</span><span class="na">sources</span><span class="o">.{</span><span class="nc">JsonDFSSource</span> <span class="o">(</span><span class="k">default</span><span class="o">),</span>
+      <span class="nc">AvroDFSSource</span><span class="o">,</span> <span class="nc">JsonKafkaSource</span><span class="o">,</span> <span class="nc">AvroKafkaSource</span><span class="o">,</span> <span class="nc">HiveIncrPullSource</span><span class="o">}</span>
+      <span class="nl">Default:</span> <span class="n">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</span><span class="na">utilities</span><span class="o">.</span><span class="na">sources</span><span class="o">.</span><span class="na">JsonDFSSource</span>
+    <span class="o">--</span><span class="n">source</span><span class="o">-</span><span class="n">limit</span>
+      <span class="nc">Maximum</span> <span class="n">amount</span> <span class="n">of</span> <span class="n">data</span> <span class="n">to</span> <span class="n">read</span> <span class="n">from</span> <span class="n">source</span><span class="o">.</span> <span class="nl">Default:</span> <span class="nc">No</span> <span class="n">limit</span> <span class="nc">For</span> <span class="n">e</span><span class="o">.</span><span class="na">g</span><span class="o">:</span>
+      <span class="nc">DFSSource</span> <span class="o">=&gt;</span> <span class="n">max</span> <span class="n">bytes</span> <span class="n">to</span> <span class="n">read</span><span class="o">,</span> <span class="nc">KafkaSource</span> <span class="o">=&gt;</span> <span class="n">max</span> <span class="n">events</span> <span class="n">to</span> <span class="n">read</span>
+      <span class="nl">Default:</span> <span class="mi">9223372036854775807</span>
+    <span class="o">--</span><span class="n">source</span><span class="o">-</span><span class="n">ordering</span><span class="o">-</span><span class="n">field</span>
+      <span class="nc">Field</span> <span class="n">within</span> <span class="n">source</span> <span class="n">record</span> <span class="n">to</span> <span class="n">decide</span> <span class="n">how</span> <span class="n">to</span> <span class="k">break</span> <span class="n">ties</span> <span class="n">between</span> <span class="n">records</span>
+      <span class="n">with</span> <span class="n">same</span> <span class="n">key</span> <span class="n">in</span> <span class="n">input</span> <span class="n">data</span><span class="o">.</span> <span class="nl">Default:</span> <span class="err">'</span><span class="n">ts</span><span class="err">'</span> <span class="n">holding</span> <span class="n">unix</span> <span class="n">timestamp</span> <span class="n">of</span>
+      <span class="n">record</span>
+      <span class="nl">Default:</span> <span class="n">ts</span>
+    <span class="o">--</span><span class="n">spark</span><span class="o">-</span><span class="n">master</span>
+      <span class="n">spark</span> <span class="n">master</span> <span class="n">to</span> <span class="n">use</span><span class="o">.</span>
+      <span class="nl">Default:</span> <span class="n">local</span><span class="o">[</span><span class="mi">2</span><span class="o">]</span>
+  <span class="o">*</span> <span class="o">--</span><span class="n">target</span><span class="o">-</span><span class="n">base</span><span class="o">-</span><span class="n">path</span>
+      <span class="n">base</span> <span class="n">path</span> <span class="k">for</span> <span class="n">the</span> <span class="n">target</span> <span class="nc">Hudi</span> <span class="n">dataset</span><span class="o">.</span> <span class="o">(</span><span class="nc">Will</span> <span class="n">be</span> <span class="n">created</span> <span class="k">if</span> <span class="n">did</span> <span class="n">not</span>
+      <span class="n">exist</span> <span class="n">first</span> <span class="n">time</span> <span class="n">around</span><span class="o">.</span> <span class="nc">If</span> <span class="n">exists</span><span class="o">,</span> <span class="n">expected</span> <span class="n">to</span> <span class="n">be</span> <span class="n">a</span> <span class="nc">Hudi</span> <span class="n">dataset</span><span class="o">)</span>
+  <span class="o">*</span> <span class="o">--</span><span class="n">target</span><span class="o">-</span><span class="n">table</span>
+      <span class="n">name</span> <span class="n">of</span> <span class="n">the</span> <span class="n">target</span> <span class="n">table</span> <span class="n">in</span> <span class="nc">Hive</span>
+    <span class="o">--</span><span class="n">transformer</span><span class="o">-</span><span class="kd">class</span>
+      <span class="nc">subclass</span> <span class="n">of</span> <span class="n">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</span><span class="na">utilities</span><span class="o">.</span><span class="na">transform</span><span class="o">.</span><span class="na">Transformer</span><span class="o">.</span> <span class="no">UDF</span> <span class="n">to</span>
+      <span class="n">transform</span> <span class="n">raw</span> <span class="n">source</span> <span class="n">dataset</span> <span class="n">to</span> <span class="n">a</span> <span class="n">target</span> <span class="nf">dataset</span> <span class="o">(</span><span class="n">conforming</span> <span class="n">to</span> <span class="n">target</span>
+      <span class="n">schema</span><span class="o">)</span> <span class="n">before</span> <span class="n">writing</span><span class="o">.</span> <span class="nc">Default</span> <span class="o">:</span> <span class="nc">Not</span> <span class="n">set</span><span class="o">.</span> <span class="nl">E:</span><span class="n">g</span> <span class="o">-</span>
+      <span class="n">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</span><span class="na">utilities</span><span class="o">.</span><span class="na">transform</span><span class="o">.</span><span class="na">SqlQueryBasedTransformer</span> <span class="o">(</span><span class="n">which</span>
+      <span class="n">allows</span> <span class="n">a</span> <span class="no">SQL</span> <span class="n">query</span> <span class="n">template</span> <span class="n">to</span> <span class="n">be</span> <span class="n">passed</span> <span class="n">as</span> <span class="n">a</span> <span class="n">transformation</span> <span class="n">function</span><span class="o">)</span>
+</code></pre></div></div>
+
+<p>The tool takes a hierarchically composed property file, and has pluggable interfaces for extracting data, generating keys and providing schemas.
+ Sample configs for ingesting from Kafka and DFS are provided under <code class="highlighter-rouge">hudi-utilities/src/test/resources/delta-streamer-config</code>.</p>
+
+<p>For example: once you have Confluent Kafka and the Schema Registry up and running, you can produce some test data with this command
+ (<a href="https://docs.confluent.io/current/ksql/docs/tutorials/generate-custom-test-data.html">impressions.avro</a>,
+ provided by the schema-registry repo)</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">[</span><span class="n">confluent</span><span class="o">-</span><span class="mf">5.0</span><span class="o">.</span><span class="mi">0</span><span class="o">]</span><span class="err">$</span> <span class="n">bin</span><span class="o">/</span><span class="n">ksql</span><span class="o">-</span><span class="n">datagen</span> <span class="n">schema</span><span class="o">=../</span> [...]
+</code></pre></div></div>
+
+<p>and then ingest it as follows.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">[</span><span class="n">hoodie</span><span class="o">]</span><span class="err">$</span> <span class="n">spark</span><span class="o">-</span><span class="n">submit</span> <span class="o">--</span><span class="kd">class</span> <span class="nc">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</sp [...]
+  <span class="o">--</span><span class="n">props</span> <span class="nl">file:</span><span class="c1">//${PWD}/hudi-utilities/src/test/resources/delta-streamer-config/kafka-source.properties \</span>
+  <span class="o">--</span><span class="n">schemaprovider</span><span class="o">-</span><span class="kd">class</span> <span class="nc">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</span><span class="na">utilities</span><span class="o">.</span><span class="na">schema</span><span class="o">.</span><span class="na">SchemaRegistryProvider</span> <span class="err">\</span>
+  <span class="o">--</span><span class="n">source</span><span class="o">-</span><span class="kd">class</span> <span class="nc">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">hudi</span><span class="o">.</span><span class="na">utilities</span><span class="o">.</span><span class="na">sources</span><span class="o">.</span><span class="na">AvroKafkaSource</span> <span class="err">\</span>
+  <span class="o">--</span><span class="n">source</span><span class="o">-</span><span class="n">ordering</span><span class="o">-</span><span class="n">field</span> <span class="n">impresssiontime</span> <span class="err">\</span>
+  <span class="o">--</span><span class="n">target</span><span class="o">-</span><span class="n">base</span><span class="o">-</span><span class="n">path</span> <span class="nl">file:</span><span class="c1">///tmp/hudi-deltastreamer-op --target-table uber.impressions \</span>
+  <span class="o">--</span><span class="n">op</span> <span class="no">BULK_INSERT</span>
+</code></pre></div></div>
+
+<p>In some cases, you may want to migrate an existing dataset into Hudi beforehand. Please refer to the <a href="/cn/docs/0.5.1-migration_guide.html">migration guide</a>.</p>
+
+<h2 id="datasource-writer">Datasource Writer</h2>
+
+<p>The <code class="highlighter-rouge">hudi-spark</code> module offers the DataSource API to write (and also read) any DataFrame into a Hudi dataset.
+ Following is how to upsert a DataFrame, while specifying the field names to use:
+ <code class="highlighter-rouge">recordKey =&gt; _row_key</code>, <code class="highlighter-rouge">partitionPath =&gt; partition</code> and <code class="highlighter-rouge">precombineKey =&gt; timestamp</code></p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">inputDF</span><span class="o">.</span><span class="na">write</span><span class="o">()</span>
+       <span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"org.apache.hudi"</span><span class="o">)</span>
+       <span class="o">.</span><span class="na">options</span><span class="o">(</span><span class="n">clientOpts</span><span class="o">)</span> <span class="c1">// any of the Hudi client options can be passed in as well</span>
+       <span class="o">.</span><span class="na">option</span><span class="o">(</span><span class="nc">DataSourceWriteOptions</span><span class="o">.</span><span class="na">RECORDKEY_FIELD_OPT_KEY</span><span class="o">(),</span> <span class="s">"_row_key"</span><span class="o">)</span>
+       <span class="o">.</span><span class="na">option</span><span class="o">(</span><span class="nc">DataSourceWriteOptions</span><span class="o">.</span><span class="na">PARTITIONPATH_FIELD_OPT_KEY</span><span class="o">(),</span> <span class="s">"partition"</span><span class="o">)</span>
+       <span class="o">.</span><span class="na">option</span><span class="o">(</span><span class="nc">DataSourceWriteOptions</span><span class="o">.</span><span class="na">PRECOMBINE_FIELD_OPT_KEY</span><span class="o">(),</span> <span class="s">"timestamp"</span><span class="o">)</span>
+       <span class="o">.</span><span class="na">option</span><span class="o">(</span><span class="nc">HoodieWriteConfig</span><span class="o">.</span><span class="na">TABLE_NAME</span><span class="o">,</span> <span class="n">tableName</span><span class="o">)</span>
+       <span class="o">.</span><span class="na">mode</span><span class="o">(</span><span class="nc">SaveMode</span><span class="o">.</span><span class="na">Append</span><span class="o">)</span>
+       <span class="o">.</span><span class="na">save</span><span class="o">(</span><span class="n">basePath</span><span class="o">);</span>
+</code></pre></div></div>
+
+<h2 id="与hive同步">Syncing to Hive</h2>
+
+<p>Both of the tools above support syncing the dataset's latest schema to the Hive metastore, such that queries can pick up new columns and partitions.
+ In case you need to run this from the command line or in an independent JVM, Hudi provides a <code class="highlighter-rouge">HiveSyncTool</code>,
+ which can be invoked as follows, once you have built the hudi-hive module.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">cd</span> <span class="n">hudi</span><span class="o">-</span><span class="n">hive</span>
+<span class="o">./</span><span class="n">run_sync_tool</span><span class="o">.</span><span class="na">sh</span>
+ <span class="o">[</span><span class="n">hudi</span><span class="o">-</span><span class="n">hive</span><span class="o">]</span><span class="err">$</span> <span class="o">./</span><span class="n">run_sync_tool</span><span class="o">.</span><span class="na">sh</span> <span class="o">--</span><span class="n">help</span>
+<span class="nl">Usage:</span> <span class="o">&lt;</span><span class="n">main</span> <span class="kd">class</span><span class="err">&gt;</span> <span class="err">[</span><span class="nc">options</span><span class="o">]</span>
+  <span class="nl">Options:</span>
+  <span class="o">*</span> <span class="o">--</span><span class="n">base</span><span class="o">-</span><span class="n">path</span>
+       <span class="nc">Basepath</span> <span class="n">of</span> <span class="nc">Hudi</span> <span class="n">dataset</span> <span class="n">to</span> <span class="n">sync</span>
+  <span class="o">*</span> <span class="o">--</span><span class="n">database</span>
+       <span class="n">name</span> <span class="n">of</span> <span class="n">the</span> <span class="n">target</span> <span class="n">database</span> <span class="n">in</span> <span class="nc">Hive</span>
+    <span class="o">--</span><span class="n">help</span><span class="o">,</span> <span class="o">-</span><span class="n">h</span>
+       <span class="nl">Default:</span> <span class="kc">false</span>
+  <span class="o">*</span> <span class="o">--</span><span class="n">jdbc</span><span class="o">-</span><span class="n">url</span>
+       <span class="nc">Hive</span> <span class="n">jdbc</span> <span class="n">connect</span> <span class="n">url</span>
+  <span class="o">*</span> <span class="o">--</span><span class="n">pass</span>
+       <span class="nc">Hive</span> <span class="n">password</span>
+  <span class="o">*</span> <span class="o">--</span><span class="n">table</span>
+       <span class="n">name</span> <span class="n">of</span> <span class="n">the</span> <span class="n">target</span> <span class="n">table</span> <span class="n">in</span> <span class="nc">Hive</span>
+  <span class="o">*</span> <span class="o">--</span><span class="n">user</span>
+       <span class="nc">Hive</span> <span class="n">username</span>
+</code></pre></div></div>
+
+<h2 id="删除数据">Deleting Data</h2>
+
+<p>Hudi supports two types of deletes on data stored in Hudi datasets, by allowing the user to specify a different record payload implementation.</p>
+
+<ul>
+  <li><strong>Soft Deletes</strong> : With soft deletes, the user wants to retain the key but null out the values of all other fields.
+ This is easily achieved by ensuring the appropriate fields are nullable in the dataset schema, and simply upserting the records after setting those fields to null.</li>
+  <li><strong>Hard Deletes</strong> : This stronger form of delete physically removes any trace of the record from the dataset.
+ It is achieved by issuing an upsert, via either the DataSource or DeltaStreamer, with a custom payload implementation that always returns Optional.Empty as the combined value.
+ Hudi ships with a built-in <code class="highlighter-rouge">org.apache.hudi.EmptyHoodieRecordPayload</code> class that does exactly this.</li>
+</ul>
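The soft-delete path above can be sketched in plain Java, independently of any Hudi API: retain the key field, null out everything else, then upsert the record back. The field names used here are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: a soft delete keeps the record key and nulls out
// every other (nullable) field before the record is upserted back into the
// dataset. Field names ("_row_key", "amount") are hypothetical examples.
public class SoftDelete {

    static Map<String, Object> softDelete(Map<String, Object> record, String keyField) {
        Map<String, Object> nulled = new HashMap<>();
        for (String field : record.keySet()) {
            // the key survives; all other fields become null
            nulled.put(field, field.equals(keyField) ? record.get(field) : null);
        }
        return nulled;
    }

    public static void main(String[] args) {
        Map<String, Object> record = new HashMap<>();
        record.put("_row_key", "k1");
        record.put("amount", 42);
        Map<String, Object> nulled = softDelete(record, "_row_key");
        System.out.println(nulled.get("_row_key")); // k1
        System.out.println(nulled.get("amount"));   // null
    }
}
```

This only works if the non-key fields are declared nullable in the dataset schema, as the list above notes.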
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">deleteDF</span> <span class="c1">// DataFrame containing only the records to be deleted</span>
+   <span class="o">.</span><span class="na">write</span><span class="o">().</span><span class="na">format</span><span class="o">(</span><span class="s">"org.apache.hudi"</span><span class="o">)</span>
+   <span class="o">.</span><span class="na">option</span><span class="o">(...)</span> <span class="c1">// add Hudi options such as record key, partition path and others as needed for your setup</span>
+   <span class="c1">// specify record_key, partition_key, precombine_fieldkey and the usual params</span>
+   <span class="o">.</span><span class="na">option</span><span class="o">(</span><span class="nc">DataSourceWriteOptions</span><span class="o">.</span><span class="na">PAYLOAD_CLASS_OPT_KEY</span><span class="o">,</span> <span class="s">"org.apache.hudi.EmptyHoodieRecordPayload"</span><span class="o">)</span>
+ 
+</code></pre></div></div>
+
+<h2 id="存储管理">Storage Management</h2>
+
+<p>Hudi also performs several key storage management functions on the data stored in a Hudi dataset. A key aspect of storing data on DFS is managing file sizes and counts and reclaiming storage space.
+ For example, HDFS is infamous for its handling of small files, which exerts memory and RPC pressure on the Name Node and can potentially destabilize the entire cluster.
+ In general, query engines provide much better performance on adequately sized columnar files, since they can effectively amortize the cost of obtaining column statistics and the like.
+ Even on some cloud data stores, listing directories with a large number of small files is often slow.</p>
+
+<p>Here are some ways to efficiently manage the storage of your Hudi datasets.</p>
+
+<ul>
+  <li>The <a href="/cn/docs/0.5.1-configurations.html#compactionSmallFileSize">small file handling feature</a> in Hudi profiles the incoming workload
+ and distributes inserts to existing file groups instead of creating new file groups, which would lead to small files.</li>
+  <li>The Cleaner can be <a href="/cn/docs/0.5.1-configurations.html#retainCommits">configured</a> to clean up older file slices, with the aggressiveness tuned
+ based on how long queries may run and how much lookback incremental pull needs.</li>
+  <li>Users can also tune the size of the <a href="/cn/docs/0.5.1-configurations.html#limitFileSize">base/parquet file</a>, <a href="/cn/docs/0.5.1-configurations.html#logFileMaxSize">log files</a>
+ and the expected <a href="/cn/docs/0.5.1-configurations.html#parquetCompressionRatio">compression ratio</a>, such that a sufficient number of inserts is grouped into the same file group, ultimately producing well-sized base files.</li>
+  <li>Intelligently tuning the <a href="/cn/docs/0.5.1-configurations.html#withBulkInsertParallelism">bulk insert parallelism</a> can yield well-sized initial file groups.
+ Getting this right is in fact critical, since file groups cannot be deleted once created; they can only be expanded, as explained earlier.</li>
+  <li>For workloads with heavy updates, <a href="/cn/docs/0.5.1-concepts.html#merge-on-read-storage">merge-on-read storage</a> provides a nice mechanism
+ for quickly ingesting data into smaller files and later merging them into larger base files via compaction.</li>
+</ul>
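The small-file handling idea in the first bullet above can be sketched as a simple bin-packing step: route an insert batch to the smallest existing file group that stays below a target size, and open a new file group only when nothing has room. This is a conceptual sketch under assumed names and sizes, not Hudi's actual algorithm or configuration.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: assign an insert batch to the smallest file group
// that would remain at or below the target size; otherwise open a new group.
// Sizes are in bytes; group names and thresholds are hypothetical.
public class SmallFilePacking {

    static String assign(Map<String, Long> fileGroupSizes, long batchBytes, long targetBytes) {
        String best = null;
        long bestSize = Long.MAX_VALUE;
        for (Map.Entry<String, Long> e : fileGroupSizes.entrySet()) {
            if (e.getValue() + batchBytes <= targetBytes && e.getValue() < bestSize) {
                best = e.getKey();
                bestSize = e.getValue();
            }
        }
        if (best == null) {
            best = "fg-new-" + fileGroupSizes.size(); // no room anywhere: open a new file group
        }
        fileGroupSizes.merge(best, batchBytes, Long::sum);
        return best;
    }

    public static void main(String[] args) {
        Map<String, Long> sizes = new HashMap<>();
        sizes.put("fg-1", 40L * 1024 * 1024);   // 40 MB
        sizes.put("fg-2", 110L * 1024 * 1024);  // 110 MB
        long target = 120L * 1024 * 1024;       // hypothetical 120 MB target file size
        // a 20 MB batch fits into fg-1 (40 + 20 <= 120) but not fg-2 (110 + 20 > 120)
        System.out.println(assign(sizes, 20L * 1024 * 1024, target)); // fg-1
    }
}
```

The same intuition explains why bulk insert parallelism matters: each initial write task effectively seeds one of these file groups, which can later only grow.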
+
+      </section>
+
+      <a href="#masthead__inner-wrap" class="back-to-top">Back to top &uarr;</a>
+
+
+      
+
+    </div>
+
+  </article>
+
+</div>
+
+    </div>
+
+    <div class="page__footer">
+      <footer>
+        
+<div class="row">
+  <div class="col-lg-12 footer">
+    <p>
+      <a class="footer-link-img" href="https://apache.org">
+        <img width="250px" src="/assets/images/asf_logo.svg" alt="The Apache Software Foundation">
+      </a>
+    </p>
+    <p>
+      Copyright &copy; <span id="copyright-year">2019</span> <a href="https://apache.org">The Apache Software Foundation</a>, Licensed under the Apache License, Version 2.0.
+      Hudi, Apache and the Apache feather logo are trademarks of The Apache Software Foundation. <a href="/docs/privacy">Privacy Policy</a>
+      <br>
+      Apache Hudi is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the <a href="http://incubator.apache.org/">Apache Incubator</a>.
+      Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have
+      stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a
+      reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
+    </p>
+  </div>
+</div>
+      </footer>
+    </div>
+
+    
+<script src="/assets/js/main.min.js"></script>
+
+
+  </body>
+</html>
\ No newline at end of file
diff --git a/content/cn/docs/docs-versions.html b/content/cn/docs/docs-versions.html
index 37414e0..675bb99 100644
--- a/content/cn/docs/docs-versions.html
+++ b/content/cn/docs/docs-versions.html
@@ -4,7 +4,7 @@
     <meta charset="utf-8">
 
 <!-- begin _includes/seo.html --><title>文档版本 - Apache Hudi</title>
-<meta name="description" content="                              Latest            英文版            中文版                                  0.5.0            英文版            中文版                  ">
+<meta name="description" content="                              Latest            英文版            中文版                                  0.5.1            英文版            中文版                                  0.5.0            英文版            中文版                  ">
 
 <meta property="og:type" content="article">
 <meta property="og:locale" content="en_US">
@@ -13,7 +13,7 @@
 <meta property="og:url" content="https://hudi.apache.org/cn/docs/docs-versions.html">
 
 
-  <meta property="og:description" content="                              Latest            英文版            中文版                                  0.5.0            英文版            中文版                  ">
+  <meta property="og:description" content="                              Latest            英文版            中文版                                  0.5.1            英文版            中文版                                  0.5.0            英文版            中文版                  ">
 
 
 
@@ -347,6 +347,12 @@
         </tr>
       
         <tr>
+            <th>0.5.1</th>
+            <td><a href="/docs/0.5.1-quick-start-guide.html">英文版</a></td>
+            <td><a href="/cn/docs/0.5.1-quick-start-guide.html">中文版</a></td>
+        </tr>
+      
+        <tr>
             <th>0.5.0</th>
             <td><a href="/docs/0.5.0-quick-start-guide.html">英文版</a></td>
             <td><a href="/cn/docs/0.5.0-quick-start-guide.html">中文版</a></td>
diff --git a/content/cn/docs/querying_data.html b/content/cn/docs/querying_data.html
index b9b9147..835ab9e 100644
--- a/content/cn/docs/querying_data.html
+++ b/content/cn/docs/querying_data.html
@@ -350,6 +350,11 @@
     </ul>
   </li>
   <li><a href="#presto">Presto</a></li>
+  <li><a href="#impala此功能还未正式发布">Impala (not officially released yet)</a>
+    <ul>
+      <li><a href="#读优化表">Read-optimized tables</a></li>
+    </ul>
+  </li>
 </ul>
           </nav>
         </aside>
@@ -572,6 +577,33 @@ Upsert实用程序(<code class="highlighter-rouge">HoodieDeltaStreamer</code>
 <p>Presto是一种常用的查询引擎,可提供交互式查询性能。 Hudi RO表可以在Presto中无缝查询。
 这需要在整个安装过程中将<code class="highlighter-rouge">hudi-presto-bundle</code> jar放入<code class="highlighter-rouge">&lt;presto_install&gt;/plugin/hive-hadoop2/</code>中。</p>
 
+<h2 id="impala此功能还未正式发布">Impala (not officially released yet)</h2>
+
+<h3 id="读优化表">Read-optimized tables</h3>
+
+<p>Impala can query Hudi read-optimized tables on HDFS as <a href="https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/impala_tables.html#external_tables">EXTERNAL TABLE</a>s.<br />
+A Hudi read-optimized table can be created in Impala as follows:</p>
+<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE EXTERNAL TABLE database.table_name
+LIKE PARQUET '/path/to/load/xxx.parquet'
+STORED AS HUDIPARQUET
+LOCATION '/path/to/load';
+</code></pre></div></div>
+<p>Impala can take advantage of sensible file partitioning to improve query efficiency.
+To create a partitioned table, folders must be named following the convention <code class="highlighter-rouge">year=2020/month=1</code>.
+Impala uses <code class="highlighter-rouge">=</code> to separate partition names from partition values.<br />
+A partitioned Hudi read-optimized table can be created in Impala as follows:</p>
+<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE EXTERNAL TABLE database.table_name
+LIKE PARQUET '/path/to/load/xxx.parquet'
+PARTITION BY (year int, month int, day int)
+STORED AS HUDIPARQUET
+LOCATION '/path/to/load';
+ALTER TABLE database.table_name RECOVER PARTITIONS;
+</code></pre></div></div>
+<p>After Hudi has successfully written a new commit, refresh the Impala table to get the latest results.</p>
+<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>REFRESH database.table_name
+</code></pre></div></div>
+
+
       </section>
 
       <a href="#masthead__inner-wrap" class="back-to-top">Back to top &uarr;</a>
diff --git a/content/cn/releases.html b/content/cn/releases.html
index 140f11b..70cca31 100644
--- a/content/cn/releases.html
+++ b/content/cn/releases.html
@@ -154,6 +154,10 @@
 
           
         
+          <li><a href="/security" target="_self" rel="nofollow noopener noreferrer"><i class="fa fa-navicon" aria-hidden="true"></i> Report Security Issues</a></li>
+
+          
+        
       
     </ul>
   </div>
diff --git a/content/roadmap.html b/content/cn/security.html
similarity index 70%
copy from content/roadmap.html
copy to content/cn/security.html
index 306fd7a..17a7f46 100644
--- a/content/roadmap.html
+++ b/content/cn/security.html
@@ -3,14 +3,14 @@
   <head>
     <meta charset="utf-8">
 
-<!-- begin _includes/seo.html --><title>Roadmap - Apache Hudi</title>
+<!-- begin _includes/seo.html --><title>Security - Apache Hudi</title>
 <meta name="description" content="Apache Hudi ingests &amp; manages storage of large analytical datasets over DFS (HDFS or cloud stores).">
 
 <meta property="og:type" content="website">
 <meta property="og:locale" content="en_US">
 <meta property="og:site_name" content="">
-<meta property="og:title" content="Roadmap">
-<meta property="og:url" content="https://hudi.apache.org/roadmap">
+<meta property="og:title" content="Security">
+<meta property="og:url" content="https://hudi.apache.org/cn/security">
 
 
 
@@ -81,15 +81,15 @@
           
         </a>
         <ul class="visible-links"><li class="masthead__menu-item">
-              <a href="/docs/quick-start-guide.html" target="_self" >Documentation</a>
+              <a href="/cn/docs/quick-start-guide.html" target="_self" >文档</a>
             </li><li class="masthead__menu-item">
-              <a href="/community.html" target="_self" >Community</a>
+              <a href="/cn/community.html" target="_self" >社区</a>
             </li><li class="masthead__menu-item">
-              <a href="/activity.html" target="_self" >Activities</a>
+              <a href="/cn/activity.html" target="_self" >动态</a>
             </li><li class="masthead__menu-item">
               <a href="https://cwiki.apache.org/confluence/display/HUDI/FAQ" target="_blank" >FAQ</a>
             </li><li class="masthead__menu-item">
-              <a href="/releases.html" target="_self" >Releases</a>
+              <a href="/cn/releases.html" target="_self" >发布</a>
             </li></ul>
         <button class="greedy-nav__toggle hidden" type="button">
           <span class="visually-hidden">Toggle menu</span>
@@ -154,6 +154,10 @@
 
           
         
+          <li><a href="/security" target="_self" rel="nofollow noopener noreferrer"><i class="fa fa-navicon" aria-hidden="true"></i> Report Security Issues</a></li>
+
+          
+        
       
     </ul>
   </div>
@@ -170,39 +174,46 @@
     <div class="page__inner-wrap">
       
         <header>
-          <h1 id="page-title" class="page__title" itemprop="headline">Roadmap
+          <h1 id="page-title" class="page__title" itemprop="headline">Security
 </h1>
         </header>
       
 
       <section class="page__content" itemprop="text">
         
-          <style>
-            .page {
-              padding-right: 0 !important;
-            }
-          </style>
+        <aside class="sidebar__right sticky">
+          <nav class="toc">
+            <header><h4 class="nav__title"><i class="fas fa-file-alt"></i> IN THIS PAGE</h4></header>
+            <ul class="toc__menu">
+  <li><a href="#reporting-security-issues">Reporting Security Issues</a></li>
+  <li><a href="#reporting-vulnerability">Reporting Vulnerability</a></li>
+  <li><a href="#vulnerability-handling">Vulnerability Handling</a></li>
+</ul>
+          </nav>
+        </aside>
         
-        <p>Below is a depiction of what’s to come and how its sequenced.</p>
+        <h2 id="reporting-security-issues">Reporting Security Issues</h2>
 
-<h2 id="big-picture">Big Picture</h2>
+<p>The Apache Software Foundation takes a rigorous stance on eliminating security issues in its software projects. Apache Hudi is equally sensitive and responsive to issues pertaining to its features and functionality.</p>
 
-<ul>
-  <li>Fills a clear void in data ingestion, storage and processing!</li>
-  <li>Leads the convergence towards streaming style processing!</li>
-  <li>Brings transactional semantics to managing data</li>
-  <li>Positioned to solve impending demand for scale &amp; speed</li>
-  <li>Evolve as “de facto”, open, vendor neutral standard for data storage!</li>
-</ul>
+<h2 id="reporting-vulnerability">Reporting Vulnerability</h2>
+
+<p>If you have apprehensions regarding Hudi’s security, or you discover a vulnerability or potential threat, don’t hesitate to get in touch with the <a href="http://www.apache.org/security/">Apache Security Team</a> by dropping a mail at <a href="mailto:security@apache.org">security@apache.org</a>. In the mail, specify a description of the issue or potential threat. You are also urged to recommend a way to reproduce and replicate the issue. The Hudi community will get back to you after assessi [...]
+
+<p><strong>PLEASE PAY ATTENTION</strong>: report the security issue to the security email address before disclosing it in the public domain.</p>
 
-<h2 id="roadmap">Roadmap</h2>
+<h2 id="vulnerability-handling">Vulnerability Handling</h2>
 
-<p class="notice--info"><strong>ProTip:</strong> This is a rough roadmap (non exhaustive list) of what’s to come in each of the areas for Hudi.</p>
+<p>An overview of the vulnerability handling process is:</p>
 
-<figure>
-  <img src="/assets/images/roadmap.png" alt="bundle install in Terminal window" />
-</figure>
+<ul>
+  <li>The reporter reports the vulnerability privately to Apache.</li>
+  <li>The appropriate project’s security team works privately with the reporter to resolve the vulnerability.</li>
+  <li>A new release of the Apache product concerned is made that includes the fix.</li>
+  <li>The vulnerability is publicly announced.</li>
+</ul>
 
+<p>A more detailed description of the process can be found <a href="https://www.apache.org/security/committers.html">here</a>.</p>
 
       </section>
 
diff --git a/content/community.html b/content/community.html
index 728cfce..83983f8 100644
--- a/content/community.html
+++ b/content/community.html
@@ -154,6 +154,10 @@
 
           
         
+          <li><a href="/security" target="_self" rel="nofollow noopener noreferrer"><i class="fa fa-navicon" aria-hidden="true"></i> Report Security Issues</a></li>
+
+          
+        
       
     </ul>
   </div>
@@ -304,12 +308,6 @@ Committers are chosen by a majority vote of the Apache Hudi <a href="https://www
       <td>kishoreg</td>
     </tr>
     <tr>
-      <td><img src="https://avatars.githubusercontent.com/leesf" style="max-width: 100px" alt="leesf" align="middle" /></td>
-      <td><a href="https://github.com/leesf">Shaofeng Li</a></td>
-      <td>Committer</td>
-      <td>leesf</td>
-    </tr>
-    <tr>
       <td><img src="https://avatars.githubusercontent.com/lresende" style="max-width: 100px" alt="lresende" align="middle" /></td>
       <td><a href="https://github.com/lresende">Luciano Resende</a></td>
       <td>PPMC, Committer</td>
@@ -328,6 +326,18 @@ Committers are chosen by a majority vote of the Apache Hudi <a href="https://www
       <td>prasanna</td>
     </tr>
     <tr>
+      <td><img src="https://avatars.githubusercontent.com/leesf" style="max-width: 100px" alt="leesf" align="middle" /></td>
+      <td><a href="https://github.com/leesf">Shaofeng Li</a></td>
+      <td>PPMC, Committer</td>
+      <td>leesf</td>
+    </tr>
+    <tr>
+      <td><img src="https://avatars.githubusercontent.com/nsivabalan" style="max-width: 100px" alt="nsivabalan" align="middle" /></td>
+      <td><a href="https://github.com/nsivabalan">Sivabalan Narayanan</a></td>
+      <td>Committer</td>
+      <td>sivabalan</td>
+    </tr>
+    <tr>
       <td><img src="https://avatars.githubusercontent.com/smarthi" style="max-width: 100px" alt="smarthi" align="middle" /></td>
       <td><a href="https://github.com/smarthi">Suneel Marthi</a></td>
       <td>PPMC, Committer</td>
@@ -348,7 +358,7 @@ Committers are chosen by a majority vote of the Apache Hudi <a href="https://www
     <tr>
       <td><img src="https://avatars.githubusercontent.com/yanghua" style="max-width: 100px" alt="yanghua" /></td>
       <td><a href="https://github.com/yanghua">vinoyang</a></td>
-      <td>Committer</td>
+      <td>PPMC, Committer</td>
       <td>vinoyang</td>
     </tr>
     <tr>
diff --git a/content/contributing.html b/content/contributing.html
index 6b9c264..cb1fc65 100644
--- a/content/contributing.html
+++ b/content/contributing.html
@@ -154,6 +154,10 @@
 
           
         
+          <li><a href="/security" target="_self" rel="nofollow noopener noreferrer"><i class="fa fa-navicon" aria-hidden="true"></i> Report Security Issues</a></li>
+
+          
+        
       
     </ul>
   </div>
@@ -307,7 +311,14 @@ so that the community can contribute at large and help implement it much quickly
   <li>Once you finalize on a project/task, please open a new JIRA or assign an existing one to yourself.
     <ul>
       <li>Almost all PRs should be linked to a JIRA. It’s always good to have a JIRA upfront to avoid duplicating efforts.</li>
-      <li>If the changes are minor, then <code class="highlighter-rouge">[MINOR]</code> prefix can be added to Pull Request title without a JIRA.</li>
+      <li>If the changes are minor, then a <code class="highlighter-rouge">[MINOR]</code> prefix can be added to the Pull Request title without a JIRA. Below are some tips for judging whether a Pull Request is <strong>MINOR</strong>:
+        <ul>
+          <li>it is a trivial fix (for example, a typo, a broken link, or an obvious error)</li>
+          <li>the change does not alter functionality or performance in any way</li>
+          <li>it touches fewer than 100 changed lines</li>
+          <li>it is obvious that the PR would pass without waiting for CI/CD verification</li>
+        </ul>
+        </ul>
+      </li>
+      <li>But you may be asked to file a JIRA if the reviewer deems it necessary</li>
     </ul>
   </li>
diff --git a/content/docs/0.5.0-docs-versions.html b/content/docs/0.5.0-docs-versions.html
index 00b3f11..b4c6136 100644
--- a/content/docs/0.5.0-docs-versions.html
+++ b/content/docs/0.5.0-docs-versions.html
@@ -4,7 +4,7 @@
     <meta charset="utf-8">
 
 <!-- begin _includes/seo.html --><title>Docs Versions - Apache Hudi</title>
-<meta name="description" content="                              Latest            English Version            Chinese Version                                  0.5.0            English Version            Chinese Version                  ">
+<meta name="description" content="                              Latest            English Version            Chinese Version                                  0.5.1            English Version            Chinese Version                                  0.5.0            English Version            Chinese Version                  ">
 
 <meta property="og:type" content="article">
 <meta property="og:locale" content="en_US">
@@ -13,7 +13,7 @@
 <meta property="og:url" content="https://hudi.apache.org/docs/0.5.0-docs-versions.html">
 
 
-  <meta property="og:description" content="                              Latest            English Version            Chinese Version                                  0.5.0            English Version            Chinese Version                  ">
+  <meta property="og:description" content="                              Latest            English Version            Chinese Version                                  0.5.1            English Version            Chinese Version                                  0.5.0            English Version            Chinese Version                  ">
 
 
 
@@ -347,6 +347,12 @@
         </tr>
       
         <tr>
+            <th>0.5.1</th>
+            <td><a href="/docs/0.5.1-quick-start-guide.html">English Version</a></td>
+            <td><a href="/cn/docs/0.5.1-quick-start-guide.html">Chinese Version</a></td>
+        </tr>
+      
+        <tr>
             <th>0.5.0</th>
             <td><a href="/docs/0.5.0-quick-start-guide.html">English Version</a></td>
             <td><a href="/cn/docs/0.5.0-quick-start-guide.html">Chinese Version</a></td>
diff --git a/content/docs/0.5.1-comparison.html b/content/docs/0.5.1-comparison.html
new file mode 100644
index 0000000..dc4adc9
--- /dev/null
+++ b/content/docs/0.5.1-comparison.html
@@ -0,0 +1,433 @@
+<!doctype html>
+<html lang="en" class="no-js">
+  <head>
+    <meta charset="utf-8">
+
+<!-- begin _includes/seo.html --><title>Comparison - Apache Hudi</title>
+<meta name="description" content="Apache Hudi fills a big void for processing data on top of DFS, and thus mostly co-exists nicely with these technologies. However, it would be useful to understand how Hudi fits into the current big data ecosystem, contrasting it with a few related systems and bringing out the different tradeoffs these systems have accepted in their design.">
+
+<meta property="og:type" content="article">
+<meta property="og:locale" content="en_US">
+<meta property="og:site_name" content="">
+<meta property="og:title" content="Comparison">
+<meta property="og:url" content="https://hudi.apache.org/docs/0.5.1-comparison.html">
+
+
+  <meta property="og:description" content="Apache Hudi fills a big void for processing data on top of DFS, and thus mostly co-exists nicely with these technologies. However, it would be useful to understand how Hudi fits into the current big data ecosystem, contrasting it with a few related systems and bringing out the different tradeoffs these systems have accepted in their design.">
+
+
+
+
+
+  <meta property="article:modified_time" content="2019-12-30T14:59:57-05:00">
+
+
+
+
+
+
+
+<!-- end _includes/seo.html -->
+
+
+<!--<link href="/feed.xml" type="application/atom+xml" rel="alternate" title=" Feed">-->
+
+<!-- https://t.co/dKP3o1e -->
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+<script>
+  document.documentElement.className = document.documentElement.className.replace(/\bno-js\b/g, '') + ' js ';
+</script>
+
+<!-- For all browsers -->
+<link rel="stylesheet" href="/assets/css/main.css">
+
+<!--[if IE]>
+  <style>
+    /* old IE unsupported flexbox fixes */
+    .greedy-nav .site-title {
+      padding-right: 3em;
+    }
+    .greedy-nav button {
+      position: absolute;
+      top: 0;
+      right: 0;
+      height: 100%;
+    }
+  </style>
+<![endif]-->
+
+
+
+<link rel="icon" type="image/x-icon" href="/assets/images/favicon.ico">
+<link rel="stylesheet" href="/assets/css/font-awesome.min.css">
+
+  </head>
+
+  <body class="layout--single">
+    <!--[if lt IE 9]>
+<div class="notice--danger align-center" style="margin: 0;">You are using an <strong>outdated</strong> browser. Please <a href="https://browsehappy.com/">upgrade your browser</a> to improve your experience.</div>
+<![endif]-->
+
+    <div class="masthead">
+  <div class="masthead__inner-wrap" id="masthead__inner-wrap">
+    <div class="masthead__menu">
+      <nav id="site-nav" class="greedy-nav">
+        
+          <a class="site-logo" href="/">
+              <div style="width: 150px; height: 40px">
+              </div>
+          </a>
+        
+        <a class="site-title" href="/">
+          
+        </a>
+        <ul class="visible-links"><li class="masthead__menu-item">
+              <a href="/docs/quick-start-guide.html" target="_self" >Documentation</a>
+            </li><li class="masthead__menu-item">
+              <a href="/community.html" target="_self" >Community</a>
+            </li><li class="masthead__menu-item">
+              <a href="/activity.html" target="_self" >Activities</a>
+            </li><li class="masthead__menu-item">
+              <a href="https://cwiki.apache.org/confluence/display/HUDI/FAQ" target="_blank" >FAQ</a>
+            </li><li class="masthead__menu-item">
+              <a href="/releases.html" target="_self" >Releases</a>
+            </li></ul>
+        <button class="greedy-nav__toggle hidden" type="button">
+          <span class="visually-hidden">Toggle menu</span>
+          <div class="navicon"></div>
+        </button>
+        <ul class="hidden-links hidden"></ul>
+      </nav>
+    </div>
+  </div>
+</div>
+<!--
+<p class="notice--warning" style="margin: 0 !important; text-align: center !important;"><strong>Note:</strong> This site is work in progress, if you notice any issues, please <a target="_blank" href="https://github.com/apache/incubator-hudi/issues">Report on Issue</a>.
+  Click <a href="/"> here</a> back to old site.</p>
+-->
+
+    <div class="initial-content">
+      <div id="main" role="main">
+  
+
+  <div class="sidebar sticky">
+
+  
+
+  
+
+    
+      
+
+
+
+
+
+
+
+<nav class="nav__list">
+  
+  <input id="ac-toc" name="accordion-toc" type="checkbox" />
+  <label for="ac-toc">Toggle Menu</label>
+  <ul class="nav__items">
+    
+      <li>
+        
+          <span class="nav__sub-title">Getting Started</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-quick-start-guide.html" class="">Quick Start</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-use_cases.html" class="">Use Cases</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-powered_by.html" class="">Talks & Powered By</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-comparison.html" class="active">Comparison</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-docker_demo.html" class="">Docker Demo</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">Documentation</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-concepts.html" class="">Concepts</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-writing_data.html" class="">Writing Data</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-querying_data.html" class="">Querying Data</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-configurations.html" class="">Configuration</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-performance.html" class="">Performance</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-deployment.html" class="">Deployment</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">INFO</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-docs-versions.html" class="">Docs Versions</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-privacy.html" class="">Privacy Policy</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+  </ul>
+</nav>
+    
+
+  
+  </div>
+
+
+  <article class="page" itemscope itemtype="https://schema.org/CreativeWork">
+
+    <div class="page__inner-wrap">
+      
+        <header>
+          <h1 id="page-title" class="page__title" itemprop="headline">Comparison
+</h1>
+        </header>
+      
+
+      <section class="page__content" itemprop="text">
+        
+          <style>
+            .page {
+              padding-right: 0 !important;
+            }
+          </style>
+        
+        <p>Apache Hudi fills a big void for processing data on top of DFS, and thus mostly co-exists nicely with these technologies. However,
+it would be useful to understand how Hudi fits into the current big data ecosystem, contrasting it with a few related systems
+and bringing out the different tradeoffs these systems have accepted in their design.</p>
+
+<h2 id="kudu">Kudu</h2>
+
+<p><a href="https://kudu.apache.org">Apache Kudu</a> is a storage system that has similar goals to Hudi, which is to bring real-time analytics on petabytes of data via first
+class support for <code class="highlighter-rouge">upserts</code>. A key differentiator is that Kudu also attempts to serve as a datastore for OLTP workloads, something that Hudi does not aspire to be.
+Consequently, Kudu does not support incremental pulling (as of early 2017), something Hudi does to enable incremental processing use cases.</p>
+
+<p>Kudu diverges from a distributed file system abstraction and HDFS altogether, with its own set of storage servers talking to each other via RAFT.
+Hudi, on the other hand, is designed to work with an underlying Hadoop compatible filesystem (HDFS, S3 or Ceph) and does not have its own fleet of storage servers,
+instead relying on Apache Spark to do the heavy-lifting. Thus, Hudi can be scaled easily, just like other Spark jobs, while Kudu would require hardware
+&amp; operational support, typical of datastores like HBase or Vertica. We have not, at this point, done any head-to-head benchmarks against Kudu (given RTTable is WIP).
+But, if we were to go with the results shared by <a href="https://db-blog.web.cern.ch/blog/zbigniew-baranowski/2017-01-performance-comparison-different-file-formats-and-storage-engines">CERN</a>,
+we expect Hudi to be positioned as something that ingests parquet with superior performance.</p>
+
+<h2 id="hive-transactions">Hive Transactions</h2>
+
+<p><a href="https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions">Hive Transactions/ACID</a> is another similar effort, which tries to implement storage like
+<code class="highlighter-rouge">merge-on-read</code>, on top of the ORC file format. Understandably, this feature is heavily tied to Hive and other efforts like <a href="https://cwiki.apache.org/confluence/display/Hive/LLAP">LLAP</a>.
+Hive transactions do not offer the read-optimized storage option or the incremental pulling that Hudi does. In terms of implementation choices, Hudi leverages
+the full power of a processing framework like Spark, while the Hive transactions feature is implemented underneath by Hive tasks/queries kicked off by the user or the Hive metastore.
+Based on our production experience, embedding Hudi as a library into existing Spark pipelines was much easier and less operationally heavy, compared with the other approach.
+Hudi is also designed to work with non-Hive engines like Presto/Spark and will incorporate file formats other than parquet over time.</p>
+
+<h2 id="hbase">HBase</h2>
+
+<p>Even though <a href="https://hbase.apache.org">HBase</a> is ultimately a key-value store for OLTP workloads, users often tend to associate HBase with analytics given its proximity to Hadoop.
+Given HBase is heavily write-optimized, it supports sub-second upserts out of the box, and Hive-on-HBase lets users query that data. However, in terms of actual performance for analytical workloads,
+hybrid columnar storage formats like Parquet/ORC handily beat HBase, since these workloads are predominantly read-heavy. Hudi bridges this gap between faster data and having
+analytical storage formats. From an operational perspective, arming users with a library that provides faster data is more scalable than managing a big farm of HBase region servers
+just for analytics. Finally, HBase does not support incremental processing primitives like <code class="highlighter-rouge">commit times</code> and <code class="highlighter-rouge">incremental pull</code> as first class citizens, like Hudi does.</p>
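As a toy illustration of the <code class="highlighter-rouge">incremental pull</code> primitive referred to above (purely hypothetical code, not Hudi's actual API), the idea is to fetch only records written after a given commit time:

```python
# Hypothetical sketch of incremental pull: each record carries the commit
# time at which it was written, and a consumer asks only for records
# committed after the last commit it has already processed.
records = [
    {"key": "a", "value": 1, "commit_time": "20200301120000"},
    {"key": "b", "value": 2, "commit_time": "20200302120000"},
    {"key": "a", "value": 3, "commit_time": "20200303120000"},
]

def incremental_pull(records, begin_commit_time):
    """Return records committed strictly after begin_commit_time.

    Commit times are lexicographically ordered timestamps, so plain
    string comparison suffices in this toy model.
    """
    return [r for r in records if r["commit_time"] > begin_commit_time]

# A consumer that has processed up to commit 20200301120000
# receives only the two newer records.
print(incremental_pull(records, "20200301120000"))
```

This is what a key-value store like HBase lacks as a first-class primitive: there is no built-in notion of "give me everything that changed since commit X".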
+
+<h2 id="stream-processing">Stream Processing</h2>
+
+<p>A popular question we get is: “How does Hudi relate to stream processing systems?”, which we will try to answer here. Simply put, Hudi can integrate with
+batch (<code class="highlighter-rouge">copy-on-write table</code>) and streaming (<code class="highlighter-rouge">merge-on-read table</code>) jobs of today, to store the computed results in Hadoop. For Spark apps, this can happen via direct
+integration of the Hudi library with Spark/Spark Streaming DAGs. In the case of non-Spark processing systems (e.g. Flink, Hive), the processing can be done in the respective systems
+and later sent into a Hudi table via a Kafka topic/DFS intermediate file. At a more conceptual level, data processing
+pipelines just consist of three components: <code class="highlighter-rouge">source</code>, <code class="highlighter-rouge">processing</code>, <code class="highlighter-rouge">sink</code>, with users ultimately running queries against the sink to use the results of the pipeline.
+Hudi can act as either a source or sink that stores data on DFS. The applicability of Hudi to a given stream processing pipeline ultimately boils down to the suitability
+of Presto/SparkSQL/Hive for your queries.</p>
+
+<p>More advanced use cases revolve around the concepts of <a href="https://www.oreilly.com/ideas/ubers-case-for-incremental-processing-on-hadoop">incremental processing</a>, which effectively
+uses Hudi even inside the <code class="highlighter-rouge">processing</code> engine to speed up typical batch pipelines. For example, Hudi can be used as a state store inside a processing DAG (similar
+to how <a href="https://ci.apache.org/projects/flink/flink-docs-release-1.2/ops/state_backends.html#the-rocksdbstatebackend">RocksDB</a> is used by Flink). This is an item on the roadmap
+and will eventually happen as a <a href="https://issues.apache.org/jira/browse/HUDI-60">Beam Runner</a>.</p>
+
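To make the state-store idea concrete, here is a minimal, hypothetical sketch (not Hudi, Flink, or RocksDB code) of a processing step keeping running state in a key-value store between micro-batches:

```python
# Toy sketch of the state-store pattern: a key-value store that a
# streaming job reads and updates between batches. In a real system
# this store would be durable (e.g. RocksDB in Flink, or a table on
# DFS); here a plain dict stands in for it.
class StateStore:
    def __init__(self):
        self._state = {}

    def get(self, key, default=0):
        return self._state.get(key, default)

    def put(self, key, value):
        self._state[key] = value

def process_batch(store, words):
    """One micro-batch of a running word count: read the current count
    for each word from the store, increment it, and write it back."""
    for w in words:
        store.put(w, store.get(w) + 1)

store = StateStore()
process_batch(store, ["hudi", "spark", "hudi"])  # first micro-batch
process_batch(store, ["spark"])                  # second micro-batch
print(store.get("hudi"), store.get("spark"))
# → 2 2
```

The counts survive across batches because they live in the store, not in the batch itself; that is the role a table-backed state store would play inside a processing DAG.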
+      </section>
+
+      <a href="#masthead__inner-wrap" class="back-to-top">Back to top &uarr;</a>
+
+
+      
+
+    </div>
+
+  </article>
+
+</div>
+
+    </div>
+
+    <div class="page__footer">
+      <footer>
+        
+<div class="row">
+  <div class="col-lg-12 footer">
+    <p>
+      <a class="footer-link-img" href="https://apache.org">
+        <img width="250px" src="/assets/images/asf_logo.svg" alt="The Apache Software Foundation">
+      </a>
+    </p>
+    <p>
+      Copyright &copy; <span id="copyright-year">2019</span> <a href="https://apache.org">The Apache Software Foundation</a>, Licensed under the Apache License, Version 2.0.
+      Hudi, Apache and the Apache feather logo are trademarks of The Apache Software Foundation. <a href="/docs/privacy">Privacy Policy</a>
+      <br>
+      Apache Hudi is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the <a href="http://incubator.apache.org/">Apache Incubator</a>.
+      Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have
+      stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a
+      reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
+    </p>
+  </div>
+</div>
+      </footer>
+    </div>
+
+    
+<script src="/assets/js/main.min.js"></script>
+
+
+  </body>
+</html>
\ No newline at end of file
diff --git a/content/docs/0.5.1-concepts.html b/content/docs/0.5.1-concepts.html
new file mode 100644
index 0000000..73c210f
--- /dev/null
+++ b/content/docs/0.5.1-concepts.html
@@ -0,0 +1,627 @@
+<!doctype html>
+<html lang="en" class="no-js">
+  <head>
+    <meta charset="utf-8">
+
+<!-- begin _includes/seo.html --><title>Concepts - Apache Hudi</title>
+<meta name="description" content="Apache Hudi (pronounced “Hudi”) provides the following streaming primitives over hadoop compatible storages">
+
+<meta property="og:type" content="article">
+<meta property="og:locale" content="en_US">
+<meta property="og:site_name" content="">
+<meta property="og:title" content="Concepts">
+<meta property="og:url" content="https://hudi.apache.org/docs/0.5.1-concepts.html">
+
+
+  <meta property="og:description" content="Apache Hudi (pronounced “Hudi”) provides the following streaming primitives over hadoop compatible storages">
+
+
+
+
+
+  <meta property="article:modified_time" content="2019-12-30T14:59:57-05:00">
+
+
+
+
+
+
+
+<!-- end _includes/seo.html -->
+
+
+<!--<link href="/feed.xml" type="application/atom+xml" rel="alternate" title=" Feed">-->
+
+<!-- https://t.co/dKP3o1e -->
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+<script>
+  document.documentElement.className = document.documentElement.className.replace(/\bno-js\b/g, '') + ' js ';
+</script>
+
+<!-- For all browsers -->
+<link rel="stylesheet" href="/assets/css/main.css">
+
+<!--[if IE]>
+  <style>
+    /* old IE unsupported flexbox fixes */
+    .greedy-nav .site-title {
+      padding-right: 3em;
+    }
+    .greedy-nav button {
+      position: absolute;
+      top: 0;
+      right: 0;
+      height: 100%;
+    }
+  </style>
+<![endif]-->
+
+
+
+<link rel="icon" type="image/x-icon" href="/assets/images/favicon.ico">
+<link rel="stylesheet" href="/assets/css/font-awesome.min.css">
+
+  </head>
+
+  <body class="layout--single">
+    <!--[if lt IE 9]>
+<div class="notice--danger align-center" style="margin: 0;">You are using an <strong>outdated</strong> browser. Please <a href="https://browsehappy.com/">upgrade your browser</a> to improve your experience.</div>
+<![endif]-->
+
+    <div class="masthead">
+  <div class="masthead__inner-wrap" id="masthead__inner-wrap">
+    <div class="masthead__menu">
+      <nav id="site-nav" class="greedy-nav">
+        
+          <a class="site-logo" href="/">
+              <div style="width: 150px; height: 40px">
+              </div>
+          </a>
+        
+        <a class="site-title" href="/">
+          
+        </a>
+        <ul class="visible-links"><li class="masthead__menu-item">
+              <a href="/docs/quick-start-guide.html" target="_self" >Documentation</a>
+            </li><li class="masthead__menu-item">
+              <a href="/community.html" target="_self" >Community</a>
+            </li><li class="masthead__menu-item">
+              <a href="/activity.html" target="_self" >Activities</a>
+            </li><li class="masthead__menu-item">
+              <a href="https://cwiki.apache.org/confluence/display/HUDI/FAQ" target="_blank" >FAQ</a>
+            </li><li class="masthead__menu-item">
+              <a href="/releases.html" target="_self" >Releases</a>
+            </li></ul>
+        <button class="greedy-nav__toggle hidden" type="button">
+          <span class="visually-hidden">Toggle menu</span>
+          <div class="navicon"></div>
+        </button>
+        <ul class="hidden-links hidden"></ul>
+      </nav>
+    </div>
+  </div>
+</div>
+<!--
+<p class="notice--warning" style="margin: 0 !important; text-align: center !important;"><strong>Note:</strong> This site is work in progress, if you notice any issues, please <a target="_blank" href="https://github.com/apache/incubator-hudi/issues">Report on Issue</a>.
+  Click <a href="/"> here</a> back to old site.</p>
+-->
+
+    <div class="initial-content">
+      <div id="main" role="main">
+  
+
+  <div class="sidebar sticky">
+
+  
+
+  
+
+    
+      
+
+
+
+
+
+
+
+<nav class="nav__list">
+  
+  <input id="ac-toc" name="accordion-toc" type="checkbox" />
+  <label for="ac-toc">Toggle Menu</label>
+  <ul class="nav__items">
+    
+      <li>
+        
+          <span class="nav__sub-title">Getting Started</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-quick-start-guide.html" class="">Quick Start</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-use_cases.html" class="">Use Cases</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-powered_by.html" class="">Talks & Powered By</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-comparison.html" class="">Comparison</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-docker_demo.html" class="">Docker Demo</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">Documentation</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-concepts.html" class="active">Concepts</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-writing_data.html" class="">Writing Data</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-querying_data.html" class="">Querying Data</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-configurations.html" class="">Configuration</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-performance.html" class="">Performance</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-deployment.html" class="">Deployment</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">INFO</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-docs-versions.html" class="">Docs Versions</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-privacy.html" class="">Privacy Policy</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+  </ul>
+</nav>
+    
+
+  
+  </div>
+
+
+  <article class="page" itemscope itemtype="https://schema.org/CreativeWork">
+
+    <div class="page__inner-wrap">
+      
+        <header>
+          <h1 id="page-title" class="page__title" itemprop="headline">Concepts
+</h1>
+        </header>
+      
+
+      <section class="page__content" itemprop="text">
+        
+        <aside class="sidebar__right sticky">
+          <nav class="toc">
+            <header><h4 class="nav__title"><i class="fas fa-file-alt"></i> IN THIS PAGE</h4></header>
+            <ul class="toc__menu">
+  <li><a href="#timeline">Timeline</a></li>
+  <li><a href="#file-management">File management</a></li>
+  <li><a href="#index">Index</a></li>
+  <li><a href="#table-types--queries">Table Types &amp; Queries</a>
+    <ul>
+      <li><a href="#table-types">Table Types</a></li>
+      <li><a href="#query-types">Query types</a></li>
+    </ul>
+  </li>
+  <li><a href="#copy-on-write-table">Copy On Write Table</a></li>
+  <li><a href="#merge-on-read-table">Merge On Read Table</a></li>
+</ul>
+          </nav>
+        </aside>
+        
+        <p>Apache Hudi (pronounced “Hoodie”) provides the following streaming primitives over Hadoop-compatible storage</p>
+
+<ul>
+  <li>Update/Delete Records      (how do I change records in a table?)</li>
+  <li>Change Streams             (how do I fetch records that changed?)</li>
+</ul>
+
+<p>In this section, we will discuss the key concepts &amp; terminology that are important to understand in order to use these primitives effectively.</p>
+
+<h2 id="timeline">Timeline</h2>
+<p>At its core, Hudi maintains a <code class="highlighter-rouge">timeline</code> of all actions performed on the table at different <code class="highlighter-rouge">instants</code> of time that helps provide instantaneous views of the table,
+while also efficiently supporting retrieval of data in the order of arrival. A Hudi instant consists of the following components</p>
+
+<ul>
+  <li><code class="highlighter-rouge">Instant action</code> : Type of action performed on the table</li>
+  <li><code class="highlighter-rouge">Instant time</code> : Typically a timestamp (e.g. 20190117010349), which increases monotonically in the order of the action’s begin time.</li>
+  <li><code class="highlighter-rouge">state</code> : Current state of the instant</li>
+</ul>
+
+<p>Hudi guarantees that the actions performed on the timeline are atomic &amp; timeline consistent based on the instant time.</p>
+
+<p>Key actions performed include</p>
+
+<ul>
+  <li><code class="highlighter-rouge">COMMITS</code> - A commit denotes an <strong>atomic write</strong> of a batch of records into a table.</li>
+  <li><code class="highlighter-rouge">CLEANS</code> - Background activity that gets rid of older versions of files in the table that are no longer needed.</li>
+  <li><code class="highlighter-rouge">DELTA_COMMIT</code> - A delta commit refers to an <strong>atomic write</strong> of a batch of records into a MergeOnRead type table, where some/all of the data could be written just to delta logs.</li>
+  <li><code class="highlighter-rouge">COMPACTION</code> - Background activity to reconcile differential data structures within Hudi, e.g. moving updates from row-based log files to columnar formats. Internally, compaction manifests as a special commit on the timeline.</li>
+  <li><code class="highlighter-rouge">ROLLBACK</code> - Indicates that a commit/delta commit was unsuccessful &amp; rolled back, removing any partial files produced during such a write.</li>
+  <li><code class="highlighter-rouge">SAVEPOINT</code> - Marks certain file groups as “saved”, such that cleaner will not delete them. It helps restore the table to a point on the timeline, in case of disaster/data recovery scenarios.</li>
+</ul>
+
+<p>Any given instant can be 
+in one of the following states</p>
+
+<ul>
+  <li><code class="highlighter-rouge">REQUESTED</code> - Denotes an action has been scheduled, but has not yet been initiated</li>
+  <li><code class="highlighter-rouge">INFLIGHT</code> - Denotes that the action is currently being performed</li>
+  <li><code class="highlighter-rouge">COMPLETED</code> - Denotes completion of an action on the timeline</li>
+</ul>
+
+<figure>
+    <img class="docimage" src="/assets/images/hudi_timeline.png" alt="hudi_timeline.png" />
+</figure>
+
+<p>The example above shows upserts happening between 10:00 and 10:20 on a Hudi table, roughly every 5 mins, leaving commit metadata on the Hudi timeline, along
+with other background cleaning/compactions. One key observation to make is that the commit time indicates the <code class="highlighter-rouge">arrival time</code> of the data (10:20AM), while the actual data
+organization reflects the actual time, or <code class="highlighter-rouge">event time</code>, that the data was intended for (hourly buckets from 07:00). These are two key concepts when reasoning about tradeoffs between latency and completeness of data.</p>
+
+<p>When there is late-arriving data (data intended for 9:00 arriving &gt;1 hr late at 10:20), we can see the upsert producing new data into even older time buckets/folders.
+With the help of the timeline, an incremental query attempting to get all new data that was committed successfully since 10:00 hours is able to very efficiently consume
+only the changed files, without, say, scanning all the time buckets &gt; 07:00.</p>
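<p>The mechanics above can be sketched in a few lines of plain Java (the class and method names below are illustrative, not Hudi APIs): because instant times increase monotonically, “all changes since a checkpoint” reduces to a filter over completed instants.</p>

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative model of timeline instants; not Hudi's actual classes.
public class TimelineSketch {
    public static class Instant {
        final String action;
        final String time;   // e.g. "20190117010349"
        final String state;  // REQUESTED / INFLIGHT / COMPLETED
        public Instant(String action, String time, String state) {
            this.action = action; this.time = time; this.state = state;
        }
    }

    // Instant times sort lexicographically in arrival order, so an incremental
    // consumer only needs completed instants later than its checkpoint.
    public static List<String> completedSince(List<Instant> timeline, String checkpoint) {
        return timeline.stream()
                .filter(i -> i.state.equals("COMPLETED"))
                .filter(i -> i.time.compareTo(checkpoint) > 0)
                .map(i -> i.time)
                .collect(Collectors.toList());
    }
}
```

<p>A consumer checkpointing at 10:00 would pick up only the commits completed after that instant, ignoring inflight or earlier ones.</p>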
+
+<h2 id="file-management">File management</h2>
+<p>Hudi organizes a table into a directory structure under a <code class="highlighter-rouge">basepath</code> on DFS. A table is broken up into partitions, which are folders containing data files for that partition,
+very similar to Hive tables. Each partition is uniquely identified by its <code class="highlighter-rouge">partitionpath</code>, which is relative to the basepath.</p>
+
+<p>Within each partition, files are organized into <code class="highlighter-rouge">file groups</code>, uniquely identified by a <code class="highlighter-rouge">file id</code>. Each file group contains several
+<code class="highlighter-rouge">file slices</code>, where each slice contains a base file (<code class="highlighter-rouge">*.parquet</code>) produced at a certain commit/compaction instant time,
+ along with a set of log files (<code class="highlighter-rouge">*.log.*</code>) that contain inserts/updates to the base file since the base file was produced. 
+Hudi adopts an MVCC design, where a compaction action merges logs and base files to produce new file slices, and a cleaning action gets rid of 
+unused/older file slices to reclaim space on DFS.</p>
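<p>As a hedged illustration of this MVCC layout (hypothetical names, not Hudi's internals): a reader resolves each file group to its latest file slice by taking the slice with the greatest instant time.</p>

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: pick the newest file slice per file group.
public class FileSliceSketch {
    public static class FileSlice {
        final String fileId;       // identifies the file group
        final String instantTime;  // commit/compaction instant that produced it
        public FileSlice(String fileId, String instantTime) {
            this.fileId = fileId; this.instantTime = instantTime;
        }
    }

    public static Map<String, String> latestSlices(List<FileSlice> slices) {
        Map<String, String> latest = new HashMap<>();
        for (FileSlice s : slices) {
            // Keep the slice with the lexicographically greater instant time.
            latest.merge(s.fileId, s.instantTime, (a, b) -> a.compareTo(b) >= 0 ? a : b);
        }
        return latest;
    }
}
```

<p>Cleaning then amounts to deleting slices that are no longer the latest (or fall outside a retained window).</p>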
+
+<h2 id="index">Index</h2>
+<p>Hudi provides efficient upserts, by mapping a given hoodie key (record key + partition path) consistently to a file id, via an indexing mechanism. 
+This mapping between record key and file group/file id, never changes once the first version of a record has been written to a file. In short, the 
+mapped file group contains all versions of a group of records.</p>
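<p>A toy sketch of that invariant (hypothetical code, not Hudi's index implementation): once a hoodie key is assigned a file id, every later tagging of the same key returns the same file id.</p>

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Illustrative in-memory index; real Hudi indexes (e.g. bloom-filter based)
// are more elaborate, but preserve the same key -> file id invariant.
public class IndexSketch {
    private final Map<String, String> keyToFileId = new HashMap<>();

    // Returns the existing file id for an update, or allocates a new file group for an insert.
    public String tagLocation(String recordKey, String partitionPath) {
        String hoodieKey = partitionPath + "/" + recordKey;
        return keyToFileId.computeIfAbsent(hoodieKey, k -> UUID.randomUUID().toString());
    }
}
```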
+
+<h2 id="table-types--queries">Table Types &amp; Queries</h2>
+<p>Hudi table types define how data is indexed &amp; laid out on the DFS and how the above primitives and timeline activities are implemented on top of such organization (i.e., how data is written). 
+In turn, <code class="highlighter-rouge">query types</code> define how the underlying data is exposed to queries (i.e., how data is read).</p>
+
+<table>
+  <thead>
+    <tr>
+      <th>Table Type</th>
+      <th>Supported Query types</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>Copy On Write</td>
+      <td>Snapshot Queries + Incremental Queries</td>
+    </tr>
+    <tr>
+      <td>Merge On Read</td>
+      <td>Snapshot Queries + Incremental Queries + Read Optimized Queries</td>
+    </tr>
+  </tbody>
+</table>
+
+<h3 id="table-types">Table Types</h3>
+<p>Hudi supports the following table types.</p>
+
+<ul>
+  <li><a href="#copy-on-write-table">Copy On Write</a> : Stores data using exclusively columnar file formats (e.g. parquet). Updates simply version &amp; rewrite the files by performing a synchronous merge during write.</li>
+  <li><a href="#merge-on-read-table">Merge On Read</a> : Stores data using a combination of columnar (e.g. parquet) + row-based (e.g. avro) file formats. Updates are logged to delta files &amp; later compacted to produce new versions of columnar files synchronously or asynchronously.</li>
+</ul>
+
+<p>The following table summarizes the trade-offs between these two table types.</p>
+
+<table>
+  <thead>
+    <tr>
+      <th>Trade-off</th>
+      <th>CopyOnWrite</th>
+      <th>MergeOnRead</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>Data Latency</td>
+      <td>Higher</td>
+      <td>Lower</td>
+    </tr>
+    <tr>
+      <td>Update cost (I/O)</td>
+      <td>Higher (rewrite entire parquet)</td>
+      <td>Lower (append to delta log)</td>
+    </tr>
+    <tr>
+      <td>Parquet File Size</td>
+      <td>Smaller (high update (I/O) cost)</td>
+      <td>Larger (low update cost)</td>
+    </tr>
+    <tr>
+      <td>Write Amplification</td>
+      <td>Higher</td>
+      <td>Lower (depending on compaction strategy)</td>
+    </tr>
+  </tbody>
+</table>
+
+<h3 id="query-types">Query types</h3>
+<p>Hudi supports the following query types.</p>
+
+<ul>
+  <li><strong>Snapshot Queries</strong> : Queries see the latest snapshot of the table as of a given commit or compaction action. In the case of a merge-on-read table, it exposes near real-time data (a few minutes old) by merging 
+ the base and delta files of the latest file slice on the fly. For a copy-on-write table, it provides a drop-in replacement for existing parquet tables, while providing upsert/delete and other write-side features.</li>
+  <li><strong>Incremental Queries</strong> : Queries only see new data written to the table since a given commit/compaction. This effectively provides change streams to enable incremental data pipelines.</li>
+  <li><strong>Read Optimized Queries</strong> : Queries see the latest snapshot of the table as of a given commit/compaction action. Exposes only the base/columnar files in the latest file slices and guarantees the 
+ same columnar query performance as a non-Hudi columnar table.</li>
+</ul>
+
+<p>The following table summarizes the trade-offs between the different query types.</p>
+
+<table>
+  <thead>
+    <tr>
+      <th>Trade-off</th>
+      <th>Snapshot</th>
+      <th>Read Optimized</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>Data Latency</td>
+      <td>Lower</td>
+      <td>Higher</td>
+    </tr>
+    <tr>
+      <td>Query Latency</td>
+      <td>Higher (merge base / columnar file + row based delta / log files)</td>
+      <td>Lower (raw base / columnar file performance)</td>
+    </tr>
+  </tbody>
+</table>
+
+<h2 id="copy-on-write-table">Copy On Write Table</h2>
+
+<p>File slices in a Copy-On-Write table only contain the base/columnar file, and each commit produces new versions of base files. 
+In other words, we implicitly compact on every commit, such that only columnar data exists. As a result, the write amplification 
+(number of bytes written for 1 byte of incoming data) is much higher, while the read amplification is zero. 
+This is a much desired property for analytical workloads, which are predominantly read-heavy.</p>
+
+<p>The following illustrates how this works conceptually, when data is written into a copy-on-write table and two queries run on top of it.</p>
+
+<figure>
+    <img class="docimage" src="/assets/images/hudi_cow.png" alt="hudi_cow.png" />
+</figure>
+
+<p>As data gets written, updates to existing file groups produce a new slice for that file group stamped with the commit instant time, 
+while inserts allocate a new file group and write its first slice. These file slices and their commit instant times are color-coded above.
+SQL queries running against such a table (e.g. <code class="highlighter-rouge">select count(*)</code> counting the total records in that partition) first check the timeline for the latest commit
+and filter out all but the latest file slice of each file group. As you can see, an old query does not see the current inflight commit’s files, color-coded in pink,
+but a new query starting after the commit picks up the new data. Thus queries are immune to any write failures/partial writes and only run on committed data.</p>
+
+<p>The intention of the copy-on-write table is to fundamentally improve how tables are managed today, through</p>
+
+<ul>
+  <li>First-class support for atomically updating data at the file level, instead of rewriting whole tables/partitions</li>
+  <li>Ability to incrementally consume changes, as opposed to wasteful scans or fumbling with heuristics</li>
+  <li>Tight control of file sizes to keep query performance excellent (small files hurt query performance considerably).</li>
+</ul>
+
+<h2 id="merge-on-read-table">Merge On Read Table</h2>
+
+<p>The merge-on-read table is a superset of copy-on-write, in the sense that it still supports read-optimized queries of the table by exposing only the base/columnar files in the latest file slices.
+Additionally, it stores incoming upserts for each file group onto a row-based delta log, to support snapshot queries by applying the delta log 
+onto the latest version of each file id on the fly during query time. Thus, this table type attempts to balance read and write amplification intelligently, to provide near real-time data.
+The most significant change here would be to the compactor, which now carefully chooses which delta log files need to be compacted onto
+their columnar base file, to keep query performance in check (larger delta log files would incur longer merge times on the query side).</p>
+
+<p>The following illustrates how the table works, showing two types of queries - snapshot query and read optimized query.</p>
+
+<figure>
+    <img class="docimage" src="/assets/images/hudi_mor.png" alt="hudi_mor.png" style="max-width: 100%" />
+</figure>
+
+<p>There are a lot of interesting things happening in this example, which bring out the subtleties in the approach.</p>
+
+<ul>
+  <li>We now have commits every 1 minute or so, something we could not do in the other table type.</li>
+  <li>Within each file id group, there is now a delta log file, which holds incoming updates to records in the base columnar files. In the example, the delta log files hold
+ all the data from 10:05 to 10:10. The base columnar files are still versioned with the commit, as before.
+ Thus, if one were to simply look at the base files alone, the table layout looks exactly like a copy-on-write table.</li>
+  <li>A periodic compaction process reconciles these changes from the delta log and produces a new version of the base file, just like what happened at 10:05 in the example.</li>
+  <li>There are two ways of querying the same underlying table: a Read Optimized query and a Snapshot query, depending on whether we choose query performance or freshness of data.</li>
+  <li>The semantics around when data from a commit is available to a query change in a subtle way for a read optimized query. Note that such a query
+ running at 10:10 won’t see data after 10:05 above, while a snapshot query always sees the freshest data.</li>
+  <li>When we trigger compaction &amp; what it decides to compact holds the key to solving these hard problems. By implementing a compaction
+ strategy, where we aggressively compact the latest partitions compared to older partitions, we could ensure the read optimized queries see data
+ published within X minutes in a consistent fashion.</li>
+</ul>
+
+<p>The intention of the merge-on-read table is to enable near real-time processing directly on top of DFS, as opposed to copying
+data out to specialized systems, which may not be able to handle the data volume. There are also a few secondary benefits to 
+this table type, such as reduced write amplification (the amount of data written per byte of incoming data in a batch) by avoiding the synchronous merge of data.</p>
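<p>The on-the-fly merge performed by a snapshot query can be sketched as follows (illustrative code, not Hudi's merge implementation): records from the row-based delta log override records in the columnar base file of the latest file slice.</p>

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of a merge-on-read snapshot read over one file slice.
public class MergeOnReadSketch {
    public static Map<String, String> snapshotRead(Map<String, String> baseFile,
                                                   List<Map.Entry<String, String>> deltaLog) {
        Map<String, String> merged = new HashMap<>(baseFile);
        // Replay the delta log in commit order; later entries win.
        for (Map.Entry<String, String> update : deltaLog) {
            merged.put(update.getKey(), update.getValue());
        }
        return merged;
    }
}
```

<p>A read-optimized query, by contrast, would return <code class="highlighter-rouge">baseFile</code> as-is, trading freshness for raw columnar performance.</p>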
+
+
+      </section>
+
+      <a href="#masthead__inner-wrap" class="back-to-top">Back to top &uarr;</a>
+
+
+      
+
+    </div>
+
+  </article>
+
+</div>
+
+    </div>
+
+    <div class="page__footer">
+      <footer>
+        
+<div class="row">
+  <div class="col-lg-12 footer">
+    <p>
+      <a class="footer-link-img" href="https://apache.org">
+        <img width="250px" src="/assets/images/asf_logo.svg" alt="The Apache Software Foundation">
+      </a>
+    </p>
+    <p>
+      Copyright &copy; <span id="copyright-year">2019</span> <a href="https://apache.org">The Apache Software Foundation</a>, Licensed under the Apache License, Version 2.0.
+      Hudi, Apache and the Apache feather logo are trademarks of The Apache Software Foundation. <a href="/docs/privacy">Privacy Policy</a>
+      <br>
+      Apache Hudi is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the <a href="http://incubator.apache.org/">Apache Incubator</a>.
+      Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have
+      stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a
+      reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
+    </p>
+  </div>
+</div>
+      </footer>
+    </div>
+
+    
+<script src="/assets/js/main.min.js"></script>
+
+
+  </body>
+</html>
\ No newline at end of file
diff --git a/content/docs/0.5.1-configurations.html b/content/docs/0.5.1-configurations.html
new file mode 100644
index 0000000..de5ab60
--- /dev/null
+++ b/content/docs/0.5.1-configurations.html
@@ -0,0 +1,824 @@
+<!doctype html>
+<html lang="en" class="no-js">
+  <head>
+    <meta charset="utf-8">
+
+<!-- begin _includes/seo.html --><title>Configurations - Apache Hudi</title>
+<meta name="description" content="This page covers the different ways of configuring your job to write/read Hudi tables. At a high level, you can control behaviour at a few levels.">
+
+<meta property="og:type" content="article">
+<meta property="og:locale" content="en_US">
+<meta property="og:site_name" content="">
+<meta property="og:title" content="Configurations">
+<meta property="og:url" content="https://hudi.apache.org/docs/0.5.1-configurations.html">
+
+
+  <meta property="og:description" content="This page covers the different ways of configuring your job to write/read Hudi tables. At a high level, you can control behaviour at a few levels.">
+
+
+
+
+
+  <meta property="article:modified_time" content="2019-12-30T14:59:57-05:00">
+
+
+
+
+
+
+
+<!-- end _includes/seo.html -->
+
+
+<!--<link href="/feed.xml" type="application/atom+xml" rel="alternate" title=" Feed">-->
+
+<!-- https://t.co/dKP3o1e -->
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+<script>
+  document.documentElement.className = document.documentElement.className.replace(/\bno-js\b/g, '') + ' js ';
+</script>
+
+<!-- For all browsers -->
+<link rel="stylesheet" href="/assets/css/main.css">
+
+<!--[if IE]>
+  <style>
+    /* old IE unsupported flexbox fixes */
+    .greedy-nav .site-title {
+      padding-right: 3em;
+    }
+    .greedy-nav button {
+      position: absolute;
+      top: 0;
+      right: 0;
+      height: 100%;
+    }
+  </style>
+<![endif]-->
+
+
+
+<link rel="icon" type="image/x-icon" href="/assets/images/favicon.ico">
+<link rel="stylesheet" href="/assets/css/font-awesome.min.css">
+
+  </head>
+
+  <body class="layout--single">
+    <!--[if lt IE 9]>
+<div class="notice--danger align-center" style="margin: 0;">You are using an <strong>outdated</strong> browser. Please <a href="https://browsehappy.com/">upgrade your browser</a> to improve your experience.</div>
+<![endif]-->
+
+    <div class="masthead">
+  <div class="masthead__inner-wrap" id="masthead__inner-wrap">
+    <div class="masthead__menu">
+      <nav id="site-nav" class="greedy-nav">
+        
+          <a class="site-logo" href="/">
+              <div style="width: 150px; height: 40px">
+              </div>
+          </a>
+        
+        <a class="site-title" href="/">
+          
+        </a>
+        <ul class="visible-links"><li class="masthead__menu-item">
+              <a href="/docs/quick-start-guide.html" target="_self" >Documentation</a>
+            </li><li class="masthead__menu-item">
+              <a href="/community.html" target="_self" >Community</a>
+            </li><li class="masthead__menu-item">
+              <a href="/activity.html" target="_self" >Activities</a>
+            </li><li class="masthead__menu-item">
+              <a href="https://cwiki.apache.org/confluence/display/HUDI/FAQ" target="_blank" >FAQ</a>
+            </li><li class="masthead__menu-item">
+              <a href="/releases.html" target="_self" >Releases</a>
+            </li></ul>
+        <button class="greedy-nav__toggle hidden" type="button">
+          <span class="visually-hidden">Toggle menu</span>
+          <div class="navicon"></div>
+        </button>
+        <ul class="hidden-links hidden"></ul>
+      </nav>
+    </div>
+  </div>
+</div>
+<!--
+<p class="notice--warning" style="margin: 0 !important; text-align: center !important;"><strong>Note:</strong> This site is work in progress, if you notice any issues, please <a target="_blank" href="https://github.com/apache/incubator-hudi/issues">Report on Issue</a>.
+  Click <a href="/"> here</a> back to old site.</p>
+-->
+
+    <div class="initial-content">
+      <div id="main" role="main">
+  
+
+  <div class="sidebar sticky">
+
+  
+
+  
+
+    
+      
+
+
+
+
+
+
+
+<nav class="nav__list">
+  
+  <input id="ac-toc" name="accordion-toc" type="checkbox" />
+  <label for="ac-toc">Toggle Menu</label>
+  <ul class="nav__items">
+    
+      <li>
+        
+          <span class="nav__sub-title">Getting Started</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-quick-start-guide.html" class="">Quick Start</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-use_cases.html" class="">Use Cases</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-powered_by.html" class="">Talks & Powered By</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-comparison.html" class="">Comparison</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-docker_demo.html" class="">Docker Demo</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">Documentation</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-concepts.html" class="">Concepts</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-writing_data.html" class="">Writing Data</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-querying_data.html" class="">Querying Data</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-configurations.html" class="active">Configuration</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-performance.html" class="">Performance</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-deployment.html" class="">Deployment</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">INFO</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-docs-versions.html" class="">Docs Versions</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-privacy.html" class="">Privacy Policy</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+  </ul>
+</nav>
+    
+
+  
+  </div>
+
+
+  <article class="page" itemscope itemtype="https://schema.org/CreativeWork">
+
+    <div class="page__inner-wrap">
+      
+        <header>
+          <h1 id="page-title" class="page__title" itemprop="headline">Configurations
+</h1>
+        </header>
+      
+
+      <section class="page__content" itemprop="text">
+        
+        <aside class="sidebar__right sticky">
+          <nav class="toc">
+            <header><h4 class="nav__title"><i class="fas fa-file-alt"></i> IN THIS PAGE</h4></header>
+            <ul class="toc__menu">
+  <li><a href="#talking-to-cloud-storage">Talking to Cloud Storage</a></li>
+  <li><a href="#spark-datasource">Spark Datasource Configs</a>
+    <ul>
+      <li><a href="#write-options">Write Options</a></li>
+      <li><a href="#read-options">Read Options</a></li>
+    </ul>
+  </li>
+  <li><a href="#writeclient-configs">WriteClient Configs</a>
+    <ul>
+      <li><a href="#index-configs">Index configs</a></li>
+      <li><a href="#storage-configs">Storage configs</a></li>
+      <li><a href="#compaction-configs">Compaction configs</a></li>
+      <li><a href="#metrics-configs">Metrics configs</a></li>
+      <li><a href="#memory-configs">Memory configs</a></li>
+    </ul>
+  </li>
+</ul>
+          </nav>
+        </aside>
+        
+        <p>This page covers the different ways of configuring your job to write/read Hudi tables. 
+At a high level, you can control behaviour at a few levels.</p>
+
+<ul>
+  <li><strong><a href="#spark-datasource">Spark Datasource Configs</a></strong> : These configs control the Hudi Spark Datasource, providing the ability to define keys/partitioning, pick the write operation, specify how to merge records, or choose the query type to read.</li>
+  <li><strong><a href="#writeclient-configs">WriteClient Configs</a></strong> : Internally, the Hudi datasource uses an RDD-based <code class="highlighter-rouge">HoodieWriteClient</code> API to actually perform writes to storage. These configs provide deep control over lower-level aspects like 
+ file sizing, compression, parallelism, compaction, write schema, cleaning, etc. Although Hudi provides sane defaults, from time to time these configs may need to be tweaked to optimize for specific workloads.</li>
+  <li><strong><a href="#PAYLOAD_CLASS_OPT_KEY">RecordPayload Config</a></strong> : This is the lowest level of customization offered by Hudi. Record payloads define how to produce new values to upsert, based on the incoming new record and the 
+ stored old record. Hudi provides default implementations such as <code class="highlighter-rouge">OverwriteWithLatestAvroPayload</code>, which simply updates the table with the latest/last-written record. 
+ This can be overridden with a custom class extending the <code class="highlighter-rouge">HoodieRecordPayload</code> class, at both the datasource and WriteClient levels.</li>
+</ul>
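<p>To make the payload idea concrete, here is a hedged sketch in plain Java (not the actual <code class="highlighter-rouge">HoodieRecordPayload</code> interface, which works on Avro records): a payload decides which value survives an upsert, given the stored record and the incoming one.</p>

```java
// Illustrative stand-in for record payload semantics; the real
// HoodieRecordPayload API has pre-combine plus merge hooks.
public class PayloadSketch {
    public static class Record {
        final String key;
        final String value;
        final long orderingVal;  // e.g. a timestamp/precombine field
        public Record(String key, String value, long orderingVal) {
            this.key = key; this.value = value; this.orderingVal = orderingVal;
        }
    }

    // Keep the record with the newer ordering value; a custom payload
    // could implement any other merge rule here (partial updates, deletes, etc.).
    public static Record merge(Record stored, Record incoming) {
        return incoming.orderingVal >= stored.orderingVal ? incoming : stored;
    }
}
```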
+
+<h2 id="talking-to-cloud-storage">Talking to Cloud Storage</h2>
+
+<p>Regardless of whether the RDD/WriteClient APIs or the Datasource is used, the following information helps configure access
+to cloud stores.</p>
+
+<ul>
+  <li><a href="/docs/0.5.1-s3_hoodie">AWS S3</a> <br />
+Configurations required for S3 and Hudi co-operability.</li>
+  <li><a href="/docs/0.5.1-gcs_hoodie">Google Cloud Storage</a> <br />
+Configurations required for GCS and Hudi co-operability.</li>
+</ul>
+
+<h2 id="spark-datasource">Spark Datasource Configs</h2>
+
+<p>Spark jobs using the datasource can be configured by passing the below options into the <code class="highlighter-rouge">option(k,v)</code> method as usual.
+The actual datasource level configs are listed below.</p>
+
+<h3 id="write-options">Write Options</h3>
+
+<p>Additionally, you can pass down any of the WriteClient level configs directly using <code class="highlighter-rouge">options()</code> or <code class="highlighter-rouge">option(k,v)</code> methods.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">inputDF</span><span class="o">.</span><span class="na">write</span><span class="o">()</span>
+<span class="o">.</span><span class="na">format</span><span class="o">(</span><span class="s">"org.apache.hudi"</span><span class="o">)</span>
+<span class="o">.</span><span class="na">options</span><span class="o">(</span><span class="n">clientOpts</span><span class="o">)</span> <span class="c1">// any of the Hudi client opts can be passed in as well</span>
+<span class="o">.</span><span class="na">option</span><span class="o">(</span><span class="nc">DataSourceWriteOptions</span><span class="o">.</span><span class="na">RECORDKEY_FIELD_OPT_KEY</span><span class="o">(),</span> <span class="s">"_row_key"</span><span class="o">)</span>
+<span class="o">.</span><span class="na">option</span><span class="o">(</span><span class="nc">DataSourceWriteOptions</span><span class="o">.</span><span class="na">PARTITIONPATH_FIELD_OPT_KEY</span><span class="o">(),</span> <span class="s">"partition"</span><span class="o">)</span>
+<span class="o">.</span><span class="na">option</span><span class="o">(</span><span class="nc">DataSourceWriteOptions</span><span class="o">.</span><span class="na">PRECOMBINE_FIELD_OPT_KEY</span><span class="o">(),</span> <span class="s">"timestamp"</span><span class="o">)</span>
+<span class="o">.</span><span class="na">option</span><span class="o">(</span><span class="nc">HoodieWriteConfig</span><span class="o">.</span><span class="na">TABLE_NAME</span><span class="o">,</span> <span class="n">tableName</span><span class="o">)</span>
+<span class="o">.</span><span class="na">mode</span><span class="o">(</span><span class="nc">SaveMode</span><span class="o">.</span><span class="na">Append</span><span class="o">)</span>
+<span class="o">.</span><span class="na">save</span><span class="o">(</span><span class="n">basePath</span><span class="o">);</span>
+</code></pre></div></div>
+
+<p>Options useful for writing tables via <code class="highlighter-rouge">write.format.option(...)</code></p>
+
+<h4 id="TABLE_NAME_OPT_KEY">TABLE_NAME_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.table.name</code> [Required]<br />
+  <span style="color:grey">Hive table name to register the table into.</span></p>
+
+<h4 id="OPERATION_OPT_KEY">OPERATION_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.operation</code>, Default: <code class="highlighter-rouge">upsert</code><br />
+  <span style="color:grey">Whether to do upsert, insert or bulkinsert for the write operation. Use <code class="highlighter-rouge">bulkinsert</code> to load new data into a table, and thereafter use <code class="highlighter-rouge">upsert</code>/<code class="highlighter-rouge">insert</code>. 
+  Bulk insert uses a disk based write path that scales to large inputs without needing to cache them.</span></p>
+
+<h4 id="TABLE_TYPE_OPT_KEY">TABLE_TYPE_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.table.type</code>, Default: <code class="highlighter-rouge">COPY_ON_WRITE</code> <br />
+  <span style="color:grey">The table type for the underlying data, for this write. This can’t change between writes.</span></p>
+
+<h4 id="PRECOMBINE_FIELD_OPT_KEY">PRECOMBINE_FIELD_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.precombine.field</code>, Default: <code class="highlighter-rouge">ts</code> <br />
+  <span style="color:grey">Field used in preCombining before actual write. When two records have the same key value,
+we will pick the one with the largest value for the precombine field, determined by Object.compareTo(..)</span></p>
+
+<h4 id="PAYLOAD_CLASS_OPT_KEY">PAYLOAD_CLASS_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.payload.class</code>, Default: <code class="highlighter-rouge">org.apache.hudi.OverwriteWithLatestAvroPayload</code> <br />
+  <span style="color:grey">Payload class used. Override this if you want to roll your own merge logic when upserting/inserting. 
+  This renders any value set for <code class="highlighter-rouge">PRECOMBINE_FIELD_OPT_VAL</code> ineffective</span></p>
+
+<h4 id="RECORDKEY_FIELD_OPT_KEY">RECORDKEY_FIELD_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.recordkey.field</code>, Default: <code class="highlighter-rouge">uuid</code> <br />
+  <span style="color:grey">Record key field. Value to be used as the <code class="highlighter-rouge">recordKey</code> component of <code class="highlighter-rouge">HoodieKey</code>. Actual value
+will be obtained by invoking .toString() on the field value. Nested fields can be specified using
+the dot notation eg: <code class="highlighter-rouge">a.b.c</code></span></p>
+
+<h4 id="PARTITIONPATH_FIELD_OPT_KEY">PARTITIONPATH_FIELD_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.partitionpath.field</code>, Default: <code class="highlighter-rouge">partitionpath</code> <br />
+  <span style="color:grey">Partition path field. Value to be used as the <code class="highlighter-rouge">partitionPath</code> component of <code class="highlighter-rouge">HoodieKey</code>.
+Actual value obtained by invoking .toString() on the field value</span></p>
+
+<h4 id="KEYGENERATOR_CLASS_OPT_KEY">KEYGENERATOR_CLASS_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.keygenerator.class</code>, Default: <code class="highlighter-rouge">org.apache.hudi.SimpleKeyGenerator</code> <br />
+  <span style="color:grey">Key generator class that extracts the key out of the incoming <code class="highlighter-rouge">Row</code> object</span></p>
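+
+<p>As a hedged sketch (the class and field names here are hypothetical, and the exact base-class signature should be checked against the Hudi version in use), a custom key generator is a small class that builds a <code class="highlighter-rouge">HoodieKey</code> from the record:</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Sketch only: build a HoodieKey from hypothetical "id" and "ds" fields.
+public class MyKeyGenerator extends KeyGenerator {
+  public MyKeyGenerator(TypedProperties props) {
+    super(props);
+  }
+
+  @Override
+  public HoodieKey getKey(GenericRecord record) {
+    // record key from "id", partition path from "ds" (both illustrative)
+    return new HoodieKey(record.get("id").toString(),
+                         record.get("ds").toString());
+  }
+}
+</code></pre></div></div>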
+
+<h4 id="COMMIT_METADATA_KEYPREFIX_OPT_KEY">COMMIT_METADATA_KEYPREFIX_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.commitmeta.key.prefix</code>, Default: <code class="highlighter-rouge">_</code> <br />
+  <span style="color:grey">Option keys beginning with this prefix are automatically added to the commit/deltacommit metadata.
+This is useful for storing checkpointing information in a way that is consistent with the Hudi timeline</span></p>
+
+<h4 id="INSERT_DROP_DUPS_OPT_KEY">INSERT_DROP_DUPS_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.write.insert.drop.duplicates</code>, Default: <code class="highlighter-rouge">false</code> <br />
+  <span style="color:grey">If set to true, filters out all duplicate records from incoming dataframe, during insert operations. </span></p>
+
+<h4 id="HIVE_SYNC_ENABLED_OPT_KEY">HIVE_SYNC_ENABLED_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.enable</code>, Default: <code class="highlighter-rouge">false</code> <br />
+  <span style="color:grey">When set to true, register/sync the table to Apache Hive metastore</span></p>
+
+<h4 id="HIVE_DATABASE_OPT_KEY">HIVE_DATABASE_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.database</code>, Default: <code class="highlighter-rouge">default</code> <br />
+  <span style="color:grey">database to sync to</span></p>
+
+<h4 id="HIVE_TABLE_OPT_KEY">HIVE_TABLE_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.table</code>, [Required] <br />
+  <span style="color:grey">table to sync to</span></p>
+
+<h4 id="HIVE_USER_OPT_KEY">HIVE_USER_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.username</code>, Default: <code class="highlighter-rouge">hive</code> <br />
+  <span style="color:grey">hive user name to use</span></p>
+
+<h4 id="HIVE_PASS_OPT_KEY">HIVE_PASS_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.password</code>, Default: <code class="highlighter-rouge">hive</code> <br />
+  <span style="color:grey">hive password to use</span></p>
+
+<h4 id="HIVE_URL_OPT_KEY">HIVE_URL_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.jdbcurl</code>, Default: <code class="highlighter-rouge">jdbc:hive2://localhost:10000</code> <br />
+  <span style="color:grey">Hive server JDBC url to connect to for syncing</span></p>
+
+<h4 id="HIVE_PARTITION_FIELDS_OPT_KEY">HIVE_PARTITION_FIELDS_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.partition_fields</code>, Default: ` ` <br />
+  <span style="color:grey">field in the table to use for determining hive partition columns.</span></p>
+
+<h4 id="HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY">HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.partition_extractor_class</code>, Default: <code class="highlighter-rouge">org.apache.hudi.hive.SlashEncodedDayPartitionValueExtractor</code> <br />
+  <span style="color:grey">Class used to extract partition field values into hive partition columns.</span></p>
+
+<h4 id="HIVE_ASSUME_DATE_PARTITION_OPT_KEY">HIVE_ASSUME_DATE_PARTITION_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.hive_sync.assume_date_partitioning</code>, Default: <code class="highlighter-rouge">false</code> <br />
+  <span style="color:grey">Assume partitioning is yyyy/mm/dd</span></p>
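+
+<p>Putting the Hive sync options above together, a write that also registers the table into Hive could look like the following sketch (the database name, url and field values are illustrative):</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>inputDF.write().format("org.apache.hudi")
+  .option(DataSourceWriteOptions.HIVE_SYNC_ENABLED_OPT_KEY(), "true")
+  .option(DataSourceWriteOptions.HIVE_DATABASE_OPT_KEY(), "analytics")        // illustrative
+  .option(DataSourceWriteOptions.HIVE_TABLE_OPT_KEY(), tableName)
+  .option(DataSourceWriteOptions.HIVE_URL_OPT_KEY(), "jdbc:hive2://hiveserver:10000")
+  .option(DataSourceWriteOptions.HIVE_PARTITION_FIELDS_OPT_KEY(), "partition")
+  .mode(SaveMode.Append)
+  .save(basePath);
+</code></pre></div></div>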
+
+<h3 id="read-options">Read Options</h3>
+
+<p>Options useful for reading tables via <code class="highlighter-rouge">read.format.option(...)</code></p>
+
+<h4 id="QUERY_TYPE_OPT_KEY">QUERY_TYPE_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.query.type</code>, Default: <code class="highlighter-rouge">snapshot</code> <br />
+<span style="color:grey">Whether data needs to be read in incremental mode (new data since an instantTime),
+read optimized mode (obtain latest view, based on columnar data),
+or snapshot mode (obtain latest view, based on row &amp; columnar data)</span></p>
+
+<h4 id="BEGIN_INSTANTTIME_OPT_KEY">BEGIN_INSTANTTIME_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.read.begin.instanttime</code>, [Required in incremental mode] <br />
+<span style="color:grey">Instant time to start incrementally pulling data from. The instanttime here need not
+necessarily correspond to an instant on the timeline. New data written with an
+ <code class="highlighter-rouge">instant_time &gt; BEGIN_INSTANTTIME</code> is fetched out. E.g: ‘20170901080000’ will get
+ all new data written after Sep 1, 2017 08:00AM.</span></p>
+
+<h4 id="END_INSTANTTIME_OPT_KEY">END_INSTANTTIME_OPT_KEY</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.datasource.read.end.instanttime</code>, Default: latest instant (i.e fetches all new data since begin instant time) <br />
+<span style="color:grey"> Instant time to limit incrementally fetched data to. New data written with an
+<code class="highlighter-rouge">instant_time &lt;= END_INSTANTTIME</code> is fetched out.</span></p>
+
+<h2 id="writeclient-configs">WriteClient Configs</h2>
+
+<p>Jobs programmed directly against the RDD level APIs can build a <code class="highlighter-rouge">HoodieWriteConfig</code> object and pass it in to the <code class="highlighter-rouge">HoodieWriteClient</code> constructor. 
+HoodieWriteConfig can be built using a builder pattern as below.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nc">HoodieWriteConfig</span> <span class="n">cfg</span> <span class="o">=</span> <span class="nc">HoodieWriteConfig</span><span class="o">.</span><span class="na">newBuilder</span><span class="o">()</span>
+        <span class="o">.</span><span class="na">withPath</span><span class="o">(</span><span class="n">basePath</span><span class="o">)</span>
+        <span class="o">.</span><span class="na">forTable</span><span class="o">(</span><span class="n">tableName</span><span class="o">)</span>
+        <span class="o">.</span><span class="na">withSchema</span><span class="o">(</span><span class="n">schemaStr</span><span class="o">)</span>
+        <span class="o">.</span><span class="na">withProps</span><span class="o">(</span><span class="n">props</span><span class="o">)</span> <span class="c1">// pass raw k,v pairs from a property file.</span>
+        <span class="o">.</span><span class="na">withCompactionConfig</span><span class="o">(</span><span class="nc">HoodieCompactionConfig</span><span class="o">.</span><span class="na">newBuilder</span><span class="o">().</span><span class="na">withXXX</span><span class="o">(...).</span><span class="na">build</span><span class="o">())</span>
+        <span class="o">.</span><span class="na">withIndexConfig</span><span class="o">(</span><span class="nc">HoodieIndexConfig</span><span class="o">.</span><span class="na">newBuilder</span><span class="o">().</span><span class="na">withXXX</span><span class="o">(...).</span><span class="na">build</span><span class="o">())</span>
+        <span class="o">...</span>
+        <span class="o">.</span><span class="na">build</span><span class="o">();</span>
+</code></pre></div></div>
+
+<p>The following subsections go over different aspects of write configs, explaining the most important configs along with their property names and default values.</p>
+
+<h4 id="withPath">withPath(hoodie_base_path)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.base.path</code> [Required] <br />
+<span style="color:grey">Base DFS path under which all the data partitions are created. Always prefix it explicitly with the storage scheme (e.g hdfs://, s3:// etc). Hudi stores all the main metadata about commits, savepoints, cleaning audit logs etc in the .hoodie directory under this base directory. </span></p>
+
+<h4 id="withSchema">withSchema(schema_str)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.avro.schema</code> [Required]<br />
+<span style="color:grey">This is the current reader avro schema for the table, as a string of the entire schema. HoodieWriteClient passes this schema on to implementations of HoodieRecordPayload to convert records from the source format to avro. It is also used when re-writing records during an update. </span></p>
+
+<h4 id="forTable">forTable(table_name)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.table.name</code> [Required] <br />
+ <span style="color:grey">Table name that will be used for registering with Hive. Needs to be same across runs.</span></p>
+
+<h4 id="withBulkInsertParallelism">withBulkInsertParallelism(bulk_insert_parallelism = 1500)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.bulkinsert.shuffle.parallelism</code><br />
+<span style="color:grey">Bulk insert is meant to be used for large initial imports and this parallelism determines the initial number of files in your table. Tune this to achieve a desired optimal size during initial import.</span></p>
+
+<h4 id="withParallelism">withParallelism(insert_shuffle_parallelism = 1500, upsert_shuffle_parallelism = 1500)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.insert.shuffle.parallelism</code>, <code class="highlighter-rouge">hoodie.upsert.shuffle.parallelism</code><br />
+<span style="color:grey">Once data has been initially imported, this parallelism controls initial parallelism for reading input records. Ensure this value is high enough say: 1 partition for 1 GB of input data</span></p>
+
+<h4 id="combineInput">combineInput(on_insert = false, on_update=true)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.combine.before.insert</code>, <code class="highlighter-rouge">hoodie.combine.before.upsert</code><br />
+<span style="color:grey">Flag which first combines the input RDD and merges multiple partial records into a single record before inserting or updating in DFS</span></p>
+
+<h4 id="withWriteStatusStorageLevel">withWriteStatusStorageLevel(level = MEMORY_AND_DISK_SER)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.write.status.storage.level</code><br />
+<span style="color:grey">HoodieWriteClient.insert and HoodieWriteClient.upsert return a persisted RDD[WriteStatus], so that the client can inspect the WriteStatus and choose whether or not to commit based on the failures. This is a configuration for the storage level of this RDD </span></p>
+
+<h4 id="withAutoCommit">withAutoCommit(autoCommit = true)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.auto.commit</code><br />
+<span style="color:grey">Should HoodieWriteClient autoCommit after insert and upsert. The client can choose to turn off auto-commit and commit on a “defined success condition”</span></p>
+
+<h4 id="withAssumeDatePartitioning">withAssumeDatePartitioning(assumeDatePartitioning = false)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.assume.date.partitioning</code><br />
+<span style="color:grey">Should HoodieWriteClient assume the data is partitioned by dates, i.e three levels from base path. This is a stop-gap to support tables created by versions &lt; 0.3.1. Will be removed eventually </span></p>
+
+<h4 id="withConsistencyCheckEnabled">withConsistencyCheckEnabled(enabled = false)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.consistency.check.enabled</code><br />
+<span style="color:grey">Should HoodieWriteClient perform additional checks to ensure written files are listable on the underlying filesystem/storage. Set this to true to work around S3’s eventual consistency model and ensure all data written as part of a commit is faithfully available for queries. </span></p>
+
+<h3 id="index-configs">Index configs</h3>
+<p>Following configs control indexing behavior, which tags incoming records as either inserts or updates to older records.</p>
+
+<p><a href="#withIndexConfig">withIndexConfig</a> (HoodieIndexConfig) <br />
+<span style="color:grey">This is pluggable, to use either an external index (HBase) or the default bloom filter stored in the Parquet files</span></p>
+
+<h4 id="withIndexType">withIndexType(indexType = BLOOM)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.index.type</code> <br />
+<span style="color:grey">Type of index to use. Default is Bloom filter. Possible options are [BLOOM | HBASE | INMEMORY]. The bloom filter removes the dependency on an external system and is stored in the footer of the Parquet Data Files</span></p>
+
+<h4 id="bloomFilterNumEntries">bloomFilterNumEntries(numEntries = 60000)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.index.bloom.num_entries</code> <br />
+<span style="color:grey">Only applies if index type is BLOOM. <br />This is the number of entries to be stored in the bloom filter. We assume the maxParquetFileSize is 128MB and averageRecordSize is 1024B and hence we approx a total of 130K records in a file. The default (60000) is roughly half of this approximation. <a href="https://issues.apache.org/jira/browse/HUDI-56">HUDI-56</a> tracks computing this dynamically. Warning: Setting this very low, will generate a lot of false positives [...]
+
+<h4 id="bloomFilterFPP">bloomFilterFPP(fpp = 0.000000001)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.index.bloom.fpp</code> <br />
+<span style="color:grey">Only applies if index type is BLOOM. <br /> Error rate allowed given the number of entries. This is used to calculate how many bits should be assigned for the bloom filter and the number of hash functions. This is usually set very low (default: 0.000000001), since we prefer to trade off disk space for lower false positives</span></p>
+
+<h4 id="bloomIndexPruneByRanges">bloomIndexPruneByRanges(pruneRanges = true)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.bloom.index.prune.by.ranges</code> <br />
+<span style="color:grey">Only applies if index type is BLOOM. <br /> When true, range information from files is leveraged to speed up index lookups. Particularly helpful if the key has a monotonically increasing prefix, such as a timestamp.</span></p>
+
+<h4 id="bloomIndexUseCaching">bloomIndexUseCaching(useCaching = true)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.bloom.index.use.caching</code> <br />
+<span style="color:grey">Only applies if index type is BLOOM. <br /> When true, the input RDD will be cached to speed up index lookup by reducing IO for computing parallelism or affected partitions</span></p>
+
+<h4 id="bloomIndexTreebasedFilter">bloomIndexTreebasedFilter(useTreeFilter = true)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.bloom.index.use.treebased.filter</code> <br />
+<span style="color:grey">Only applies if index type is BLOOM. <br /> When true, interval tree based file pruning optimization is enabled. This mode speeds-up file-pruning based on key ranges when compared with the brute-force mode</span></p>
+
+<h4 id="bloomIndexBucketizedChecking">bloomIndexBucketizedChecking(bucketizedChecking = true)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.bloom.index.bucketized.checking</code> <br />
+<span style="color:grey">Only applies if index type is BLOOM. <br /> When true, bucketized bloom filtering is enabled. This reduces skew seen in sort based bloom index lookup</span></p>
+
+<h4 id="bloomIndexKeysPerBucket">bloomIndexKeysPerBucket(keysPerBucket = 10000000)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.bloom.index.keys.per.bucket</code> <br />
+<span style="color:grey">Only applies if bloomIndexBucketizedChecking is enabled and index type is bloom. <br /> This configuration controls the “bucket” size which tracks the number of record-key checks made against a single file and is the unit of work allocated to each partition performing bloom filter lookup. A higher value would amortize the fixed cost of reading a bloom filter to memory. </span></p>
+
+<h4 id="bloomIndexParallelism">bloomIndexParallelism(0)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.bloom.index.parallelism</code> <br />
+<span style="color:grey">Only applies if index type is BLOOM. <br /> This is the amount of parallelism for index lookup, which involves a Spark Shuffle. By default, this is auto computed based on input workload characteristics</span></p>
+
+<h4 id="hbaseZkQuorum">hbaseZkQuorum(zkString) [Required]</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.index.hbase.zkquorum</code> <br />
+<span style="color:grey">Only applies if index type is HBASE. HBase ZK Quorum url to connect to.</span></p>
+
+<h4 id="hbaseZkPort">hbaseZkPort(port) [Required]</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.index.hbase.zkport</code> <br />
+<span style="color:grey">Only applies if index type is HBASE. HBase ZK Quorum port to connect to.</span></p>
+
+<h4 id="hbaseTableName">hbaseZkZnodeParent(zkZnodeParent)  [Required]</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.index.hbase.zknode.path</code> <br />
+<span style="color:grey">Only applies if index type is HBASE. This is the root znode that will contain all the znodes created/used by HBase.</span></p>
+
+<h4 id="hbaseTableName">hbaseTableName(tableName)  [Required]</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.index.hbase.table</code> <br />
+<span style="color:grey">Only applies if index type is HBASE. HBase Table name to use as the index. Hudi stores the row_key and [partition_path, fileID, commitTime] mapping in the table.</span></p>
+
+<h3 id="storage-configs">Storage configs</h3>
+<p>Controls aspects around sizing parquet and log files.</p>
+
+<p><a href="#withStorageConfig">withStorageConfig</a> (HoodieStorageConfig) <br /></p>
+
+<h4 id="limitFileSize">limitFileSize (size = 120MB)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.parquet.max.file.size</code> <br />
+<span style="color:grey">Target size for parquet files produced by Hudi write phases. For DFS, this needs to be aligned with the underlying filesystem block size for optimal performance. </span></p>
+
+<h4 id="parquetBlockSize">parquetBlockSize(rowgroupsize = 120MB)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.parquet.block.size</code> <br />
+<span style="color:grey">Parquet RowGroup size. It’s best to keep this the same as the file size, so that a single column within a file is stored contiguously on disk</span></p>
+
+<h4 id="parquetPageSize">parquetPageSize(pagesize = 1MB)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.parquet.page.size</code> <br />
+<span style="color:grey">Parquet page size. Page is the unit of read within a parquet file. Within a block, pages are compressed separately. </span></p>
+
+<h4 id="parquetCompressionRatio">parquetCompressionRatio(parquetCompressionRatio = 0.1)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.parquet.compression.ratio</code> <br />
+<span style="color:grey">Expected compression of parquet data used by Hudi, when it tries to size new parquet files. Increase this value, if bulk_insert is producing smaller than expected sized files</span></p>
+
+<h4 id="parquetCompressionCodec">parquetCompressionCodec(parquetCompressionCodec = gzip)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.parquet.compression.codec</code> <br />
+<span style="color:grey">Parquet compression codec name. Default is gzip. Possible options are [gzip | snappy | uncompressed | lzo]</span></p>
+
+<h4 id="logFileMaxSize">logFileMaxSize(logFileSize = 1GB)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.logfile.max.size</code> <br />
+<span style="color:grey">LogFile max size. This is the maximum size allowed for a log file before it is rolled over to the next version. </span></p>
+
+<h4 id="logFileDataBlockMaxSize">logFileDataBlockMaxSize(dataBlockSize = 256MB)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.logfile.data.block.max.size</code> <br />
+<span style="color:grey">LogFile Data block max size. This is the maximum size allowed for a single data block to be appended to a log file. This helps to make sure the data appended to the log file is broken up into sizable blocks to prevent OOM errors. This size should be smaller than the available JVM memory. </span></p>
+
+<h4 id="logFileToParquetCompressionRatio">logFileToParquetCompressionRatio(logFileToParquetCompressionRatio = 0.35)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.logfile.to.parquet.compression.ratio</code> <br />
+<span style="color:grey">Expected additional compression as records move from log files to parquet. Used for merge_on_read table to send inserts into log files &amp; control the size of compacted parquet file.</span></p>
+
+<h3 id="compaction-configs">Compaction configs</h3>
+<p>Configs that control compaction (merging of log files onto a new parquet base file) and cleaning (reclamation of older/unused file groups).
+<a href="#withCompactionConfig">withCompactionConfig</a> (HoodieCompactionConfig) <br /></p>
+
+<h4 id="withCleanerPolicy">withCleanerPolicy(policy = KEEP_LATEST_COMMITS)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.cleaner.policy</code> <br />
+<span style="color:grey"> Cleaning policy to be used. Hudi will delete older versions of parquet files to reclaim space. Any query/computation referring to this version of the file will fail. It is good to make sure that the data is retained for more than the maximum query execution time.</span></p>
+
+<h4 id="retainCommits">retainCommits(no_of_commits_to_retain = 24)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.cleaner.commits.retained</code> <br />
+<span style="color:grey">Number of commits to retain. So data will be retained for num_of_commits * time_between_commits (scheduled). This also directly translates into how much you can incrementally pull on this table</span></p>
+
+<h4 id="archiveCommitsWith">archiveCommitsWith(minCommits = 96, maxCommits = 128)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.keep.min.commits</code>, <code class="highlighter-rouge">hoodie.keep.max.commits</code> <br />
+<span style="color:grey">Each commit is a small file in the <code class="highlighter-rouge">.hoodie</code> directory. Since DFS typically does not favor lots of small files, Hudi archives older commits into a sequential log. A commit is published atomically by a rename of the commit file.</span></p>
+
+<h4 id="withCommitsArchivalBatchSize">withCommitsArchivalBatchSize(batch = 10)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.commits.archival.batch</code> <br />
+<span style="color:grey">This controls the number of commit instants read in memory as a batch and archived together.</span></p>
+
+<h4 id="compactionSmallFileSize">compactionSmallFileSize(size = 100MB)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.parquet.small.file.limit</code> <br />
+<span style="color:grey">This should be less than maxFileSize; setting it to 0 turns off this feature. Small files can always happen because of the number of insert records in a partition in a batch. Hudi has an option to auto-resolve small files by masking inserts into this partition as updates to existing small files. The size here is the minimum file size considered as a “small file size”.</span></p>
+
+<h4 id="insertSplitSize">insertSplitSize(size = 500000)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.copyonwrite.insert.split.size</code> <br />
+<span style="color:grey">Insert Write Parallelism. Number of inserts grouped for a single partition. Writing out 100MB files, with at least 1KB records, means 100K records per file. Default is to overprovision to 500K. To improve insert latency, tune this to match the number of records in a single file. Setting this to a low number will result in small files (particularly when compactionSmallFileSize is 0)</span></p>
+
+<h4 id="autoTuneInsertSplits">autoTuneInsertSplits(true)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.copyonwrite.insert.auto.split</code> <br />
+<span style="color:grey">Should hudi dynamically compute the insertSplitSize based on the last 24 commits’ metadata. Turned off by default. </span></p>
+
+<h4 id="approxRecordSize">approxRecordSize(size = 1024)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.copyonwrite.record.size.estimate</code> <br />
+<span style="color:grey">The average record size. If specified, hudi will use this and not compute dynamically based on the last 24 commits’ metadata. No value set as default. This is critical in computing the insert parallelism and bin-packing inserts into small files. See above.</span></p>
+
+<h4 id="withInlineCompaction">withInlineCompaction(inlineCompaction = false)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.compact.inline</code> <br />
+<span style="color:grey">When set to true, compaction is triggered by the ingestion itself, right after a commit/deltacommit action as part of insert/upsert/bulk_insert</span></p>
+
+<h4 id="withMaxNumDeltaCommitsBeforeCompaction">withMaxNumDeltaCommitsBeforeCompaction(maxNumDeltaCommitsBeforeCompaction = 10)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.compact.inline.max.delta.commits</code> <br />
+<span style="color:grey">Number of max delta commits to keep before triggering an inline compaction</span></p>
+
+<h4 id="withCompactionLazyBlockReadEnabled">withCompactionLazyBlockReadEnabled(true)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.compaction.lazy.block.read</code> <br />
+<span style="color:grey">When a CompactedLogScanner merges all log files, this config helps to choose whether the log blocks should be read lazily or not. Choose true for I/O intensive lazy block reading (low memory usage) or false for memory intensive immediate block reading (high memory usage)</span></p>
+
+<h4 id="withCompactionReverseLogReadEnabled">withCompactionReverseLogReadEnabled(false)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.compaction.reverse.log.read</code> <br />
+<span style="color:grey">HoodieLogFormatReader reads a logfile in the forward direction starting from pos=0 to pos=file_length. If this config is set to true, the Reader reads the logfile in reverse direction, from pos=file_length to pos=0</span></p>
+
+<h4 id="withCleanerParallelism">withCleanerParallelism(cleanerParallelism = 200)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.cleaner.parallelism</code> <br />
+<span style="color:grey">Increase this if cleaning becomes slow.</span></p>
+
+<h4 id="withCompactionStrategy">withCompactionStrategy(compactionStrategy = org.apache.hudi.io.compact.strategy.LogFileSizeBasedCompactionStrategy)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.compaction.strategy</code> <br />
+<span style="color:grey">Compaction strategy decides which file groups are picked up for compaction during each compaction run. By default. Hudi picks the log file with most accumulated unmerged data</span></p>
+
+<h4 id="withTargetIOPerCompactionInMB">withTargetIOPerCompactionInMB(targetIOPerCompactionInMB = 500000)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.compaction.target.io</code> <br />
+<span style="color:grey">Amount of MBs to spend during compaction run for the LogFileSizeBasedCompactionStrategy. This value helps bound ingestion latency while compaction is run inline mode.</span></p>
+
+<h4 id="withTargetPartitionsPerDayBasedCompaction">withTargetPartitionsPerDayBasedCompaction(targetPartitionsPerCompaction = 10)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.compaction.daybased.target</code> <br />
+<span style="color:grey">Used by org.apache.hudi.io.compact.strategy.DayBasedCompactionStrategy to denote the number of latest partitions to compact during a compaction run.</span></p>
+
+<h4 id="payloadClassName">withPayloadClass(payloadClassName = org.apache.hudi.common.model.HoodieAvroPayload)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.compaction.payload.class</code> <br />
+<span style="color:grey">This needs to be same as class used during insert/upserts. Just like writing, compaction also uses the record payload class to merge records in the log against each other, merge again with the base file and produce the final record to be written after compaction.</span></p>
+
+<h3 id="metrics-configs">Metrics configs</h3>
+<p>Enables reporting of Hudi metrics to Graphite.
+<a href="#withMetricsConfig">withMetricsConfig</a> (HoodieMetricsConfig) <br />
+<span style="color:grey">Hudi publishes metrics on every commit, clean, rollback etc.</span></p>
+
+<h4 id="on">on(metricsOn = true)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.metrics.on</code> <br />
+<span style="color:grey">Turn sending metrics on/off. on by default.</span></p>
+
+<h4 id="withReporterType">withReporterType(reporterType = GRAPHITE)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.metrics.reporter.type</code> <br />
+<span style="color:grey">Type of metrics reporter. Graphite is the default and the only value suppported.</span></p>
+
+<h4 id="toGraphiteHost">toGraphiteHost(host = localhost)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.metrics.graphite.host</code> <br />
+<span style="color:grey">Graphite host to connect to</span></p>
+
+<h4 id="onGraphitePort">onGraphitePort(port = 4756)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.metrics.graphite.port</code> <br />
+<span style="color:grey">Graphite port to connect to</span></p>
+
+<h4 id="usePrefix">usePrefix(prefix = “”)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.metrics.graphite.metric.prefix</code> <br />
+<span style="color:grey">Standard prefix applied to all metrics. This helps to add datacenter, environment information for e.g</span></p>
+
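Pulling the metrics properties above together, a full Graphite reporting configuration using the documented defaults might look like the following map. The keys and default values come from this page; the map representation and the example prefix are assumptions for illustration:

```python
# Hypothetical metrics configuration assembled from the properties
# documented above, mostly at their stated defaults.
metrics_options = {
    "hoodie.metrics.on": "true",                   # default: on
    "hoodie.metrics.reporter.type": "GRAPHITE",    # the only supported reporter
    "hoodie.metrics.graphite.host": "localhost",   # default host
    "hoodie.metrics.graphite.port": "4756",        # default port
    "hoodie.metrics.graphite.metric.prefix": "prod.dc1",  # assumed example prefix
}

for key, value in sorted(metrics_options.items()):
    print(f"{key}={value}")
```

With the prefix set, every published commit/clean/rollback metric would be namespaced under `prod.dc1` in Graphite.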
+<h3 id="memory-configs">Memory configs</h3>
+<p>Controls memory usage for compactions and merges performed internally by Hudi
+<a href="#withMemoryConfig">withMemoryConfig</a> (HoodieMemoryConfig) <br />
+<span style="color:grey">Memory related configs</span></p>
+
+<h4 id="withMaxMemoryFractionPerPartitionMerge">withMaxMemoryFractionPerPartitionMerge(maxMemoryFractionPerPartitionMerge = 0.6)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.memory.merge.fraction</code> <br />
+<span style="color:grey">This fraction is multiplied with the user memory fraction (1 - spark.memory.fraction) to get a final fraction of heap space to use during merge </span></p>
+
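The merge budget implied by this fraction can be worked out directly. This is an illustrative sketch: the 8 GB heap is an assumed example, 0.6 is Spark's default for spark.memory.fraction, and 0.6 is the merge fraction default stated above:

```python
# Illustrative heap budgeting per the formula above:
# merge_budget = heap * (1 - spark.memory.fraction) * hoodie.memory.merge.fraction
heap_gb = 8.0                # assumed executor heap for illustration
spark_memory_fraction = 0.6  # Spark's default spark.memory.fraction
merge_fraction = 0.6         # hoodie.memory.merge.fraction default

merge_budget_gb = heap_gb * (1 - spark_memory_fraction) * merge_fraction
print(round(merge_budget_gb, 2))  # 1.92 GB of heap available to the merge
```

So with the defaults, roughly a quarter of user memory (0.4 × 0.6 = 0.24 of the heap) is earmarked for merges.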
+<h4 id="withMaxMemorySizePerCompactionInBytes">withMaxMemorySizePerCompactionInBytes(maxMemorySizePerCompactionInBytes = 1GB)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.memory.compaction.fraction</code> <br />
+<span style="color:grey">HoodieCompactedLogScanner reads logblocks, converts records to HoodieRecords and then merges these log blocks and records. At any point, the number of entries in a log block can be less than or equal to the number of entries in the corresponding parquet file. This can lead to OOM in the Scanner. Hence, a spillable map helps alleviate the memory pressure. Use this config to set the max allowable inMemory footprint of the spillable map.</span></p>
+
+<h4 id="withWriteStatusFailureFraction">withWriteStatusFailureFraction(failureFraction = 0.1)</h4>
+<p>Property: <code class="highlighter-rouge">hoodie.memory.writestatus.failure.fraction</code> <br />
+<span style="color:grey">This property controls what fraction of the failed record, exceptions we report back to driver</span></p>
+
+      </section>
+
+      <a href="#masthead__inner-wrap" class="back-to-top">Back to top &uarr;</a>
+
+
+      
+
+    </div>
+
+  </article>
+
+</div>
+
+    </div>
+
+    <div class="page__footer">
+      <footer>
+        
+<div class="row">
+  <div class="col-lg-12 footer">
+    <p>
+      <a class="footer-link-img" href="https://apache.org">
+        <img width="250px" src="/assets/images/asf_logo.svg" alt="The Apache Software Foundation">
+      </a>
+    </p>
+    <p>
+      Copyright &copy; <span id="copyright-year">2019</span> <a href="https://apache.org">The Apache Software Foundation</a>, Licensed under the Apache License, Version 2.0.
+      Hudi, Apache and the Apache feather logo are trademarks of The Apache Software Foundation. <a href="/docs/privacy">Privacy Policy</a>
+      <br>
+      Apache Hudi is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the <a href="http://incubator.apache.org/">Apache Incubator</a>.
+      Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have
+      stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a
+      reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
+    </p>
+  </div>
+</div>
+      </footer>
+    </div>
+
+    
+<script src="/assets/js/main.min.js"></script>
+
+
+  </body>
+</html>
\ No newline at end of file
diff --git a/content/docs/0.5.1-deployment.html b/content/docs/0.5.1-deployment.html
new file mode 100644
index 0000000..9686447
--- /dev/null
+++ b/content/docs/0.5.1-deployment.html
@@ -0,0 +1,984 @@
+<!doctype html>
+<html lang="en" class="no-js">
+  <head>
+    <meta charset="utf-8">
+
+<!-- begin _includes/seo.html --><title>Deployment Guide - Apache Hudi</title>
+<meta name="description" content="This section provides all the help you need to deploy and operate Hudi tables at scale. Specifically, we will cover the following aspects.">
+
+<meta property="og:type" content="article">
+<meta property="og:locale" content="en_US">
+<meta property="og:site_name" content="">
+<meta property="og:title" content="Deployment Guide">
+<meta property="og:url" content="https://hudi.apache.org/docs/0.5.1-deployment.html">
+
+
+  <meta property="og:description" content="This section provides all the help you need to deploy and operate Hudi tables at scale. Specifically, we will cover the following aspects.">
+
+
+
+
+
+  <meta property="article:modified_time" content="2019-12-30T14:59:57-05:00">
+
+
+
+
+
+
+
+<!-- end _includes/seo.html -->
+
+
+<!--<link href="/feed.xml" type="application/atom+xml" rel="alternate" title=" Feed">-->
+
+<!-- https://t.co/dKP3o1e -->
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+<script>
+  document.documentElement.className = document.documentElement.className.replace(/\bno-js\b/g, '') + ' js ';
+</script>
+
+<!-- For all browsers -->
+<link rel="stylesheet" href="/assets/css/main.css">
+
+<!--[if IE]>
+  <style>
+    /* old IE unsupported flexbox fixes */
+    .greedy-nav .site-title {
+      padding-right: 3em;
+    }
+    .greedy-nav button {
+      position: absolute;
+      top: 0;
+      right: 0;
+      height: 100%;
+    }
+  </style>
+<![endif]-->
+
+
+
+<link rel="icon" type="image/x-icon" href="/assets/images/favicon.ico">
+<link rel="stylesheet" href="/assets/css/font-awesome.min.css">
+
+  </head>
+
+  <body class="layout--single">
+    <!--[if lt IE 9]>
+<div class="notice--danger align-center" style="margin: 0;">You are using an <strong>outdated</strong> browser. Please <a href="https://browsehappy.com/">upgrade your browser</a> to improve your experience.</div>
+<![endif]-->
+
+    <div class="masthead">
+  <div class="masthead__inner-wrap" id="masthead__inner-wrap">
+    <div class="masthead__menu">
+      <nav id="site-nav" class="greedy-nav">
+        
+          <a class="site-logo" href="/">
+              <div style="width: 150px; height: 40px">
+              </div>
+          </a>
+        
+        <a class="site-title" href="/">
+          
+        </a>
+        <ul class="visible-links"><li class="masthead__menu-item">
+              <a href="/docs/quick-start-guide.html" target="_self" >Documentation</a>
+            </li><li class="masthead__menu-item">
+              <a href="/community.html" target="_self" >Community</a>
+            </li><li class="masthead__menu-item">
+              <a href="/activity.html" target="_self" >Activities</a>
+            </li><li class="masthead__menu-item">
+              <a href="https://cwiki.apache.org/confluence/display/HUDI/FAQ" target="_blank" >FAQ</a>
+            </li><li class="masthead__menu-item">
+              <a href="/releases.html" target="_self" >Releases</a>
+            </li></ul>
+        <button class="greedy-nav__toggle hidden" type="button">
+          <span class="visually-hidden">Toggle menu</span>
+          <div class="navicon"></div>
+        </button>
+        <ul class="hidden-links hidden"></ul>
+      </nav>
+    </div>
+  </div>
+</div>
+<!--
+<p class="notice--warning" style="margin: 0 !important; text-align: center !important;"><strong>Note:</strong> This site is work in progress, if you notice any issues, please <a target="_blank" href="https://github.com/apache/incubator-hudi/issues">Report on Issue</a>.
+  Click <a href="/"> here</a> back to old site.</p>
+-->
+
+    <div class="initial-content">
+      <div id="main" role="main">
+  
+
+  <div class="sidebar sticky">
+
+  
+
+  
+
+    
+      
+
+
+
+
+
+
+
+<nav class="nav__list">
+  
+  <input id="ac-toc" name="accordion-toc" type="checkbox" />
+  <label for="ac-toc">Toggle Menu</label>
+  <ul class="nav__items">
+    
+      <li>
+        
+          <span class="nav__sub-title">Getting Started</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-quick-start-guide.html" class="">Quick Start</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-use_cases.html" class="">Use Cases</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-powered_by.html" class="">Talks & Powered By</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-comparison.html" class="">Comparison</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-docker_demo.html" class="">Docker Demo</a></li>
+            
+
+          
+        </ul>
+        
+      </li>
+    
+      <li>
+        
+          <span class="nav__sub-title">Documentation</span>
+        
+
+        
+        <ul>
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-concepts.html" class="">Concepts</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-writing_data.html" class="">Writing Data</a></li>
+            
+
+          
+            
+            
+
+            
+            
+
+            
+              <li><a href="/docs/0.5.1-querying_data.html" class="">Querying Data</a></li>
+            
+
+          
... 12918 lines suppressed ...

