hudi-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [incubator-hudi] ramachandranms commented on a change in pull request #1333: [HUDI-589][DOCS] Fix querying_data page
Date Fri, 14 Feb 2020 00:15:51 GMT
ramachandranms commented on a change in pull request #1333: [HUDI-589][DOCS] Fix querying_data page
URL: https://github.com/apache/incubator-hudi/pull/1333#discussion_r379190434
 
 

 ##########
 File path: docs/_docs/2_3_querying_data.md
 ##########
 @@ -24,31 +24,49 @@ If `table name = hudi_trips` and `table type = MERGE_ON_READ`, then we get:
  
 
 As discussed in the concepts section, the one key primitive needed for [incrementally processing](https://www.oreilly.com/ideas/ubers-case-for-incremental-processing-on-hadoop),
-is `incremental pulls` (to obtain a change stream/log from a table). Hudi tables can be pulled incrementally, which means you can get ALL and ONLY the updated & new rows
-since a specified instant time. This, together with upserts, are particularly useful for building data pipelines where 1 or more source Hudi tables are incrementally pulled (streams/facts),
-joined with other tables (tables/dimensions), to [write out deltas](/docs/writing_data.html) to a target Hudi table. Incremental view is realized by querying one of the tables above,
+is obtaining a change stream/log from a table. Hudi tables can be queried incrementally, which means you can get ALL and ONLY the updated & new rows
+since a specified instant time. This, together with upserts, is particularly useful for building data pipelines where 1 or more source Hudi tables are incrementally queried (streams/facts),
+joined with other tables (tables/dimensions), to [write out deltas](/docs/writing_data.html) to a target Hudi table. Incremental queries are realized by querying one of the tables above,
 with special configurations that indicates to query planning that only incremental data needs to be fetched out of the table.
 
-In sections, below we will discuss how to access these query types from different query engines.
+
+## SUPPORT MATRIX
+
+### COPY_ON_WRITE tables
+  
+||Snapshot|Incremental|Read Optimized|
+||--------|-----------|--------------|
+|**Hive**|Y|Y|N/A|
+|**Spark datasource**|Y|Y|N/A|
+|**Spark SQL**|Y|Y|N/A|
+|**Presto**|Y|N|N/A|
+
+### MERGE_ON_READ tables
+
+||Snapshot|Incremental|Read Optimized|
+||--------|-----------|--------------|
+|**Hive**|Y|Y|Y|
+|**Spark datasource**|N|N|Y|
+|**Spark SQL**|Y|Y|Y|
+|**Presto**|N|N|Y|
 
 Review comment:
   The `Presto` section below says snapshot queries are supported for Presto, but the support matrix here says `N`.

