hawq-commits mailing list archives

From yo...@apache.org
Subject [4/5] incubator-hawq-docs git commit: Removes heap table statement, updates [#128180963]
Date Fri, 19 Aug 2016 17:48:20 GMT
Removes heap table statement, updates [#128180963]

Project: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/commit/42fa1bc9
Tree: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/tree/42fa1bc9
Diff: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/diff/42fa1bc9

Branch: refs/heads/develop
Commit: 42fa1bc9363fc6104fe575e033129a1d5701c185
Parents: 2349cea
Author: Jane Beckman <jbeckman@pivotal.io>
Authored: Thu Aug 18 11:39:47 2016 -0700
Committer: David Yozie <yozie@apache.org>
Committed: Fri Aug 19 10:47:57 2016 -0700

 reference/sql/CREATE-TABLE.html.md.erb | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/reference/sql/CREATE-TABLE.html.md.erb b/reference/sql/CREATE-TABLE.html.md.erb
index 5d1098b..99ff35e 100644
--- a/reference/sql/CREATE-TABLE.html.md.erb
+++ b/reference/sql/CREATE-TABLE.html.md.erb
@@ -228,7 +228,7 @@ The following storage options are available:
 **bucketnum** — Set to the number of hash buckets to be used in creating a hash-distributed
table, specified as an integer greater than 0 and no more than the value of `default_hash_table_bucket_number`.
The default when the table is created is 6 times the segment count. However, explicitly setting
the bucket number when creating a hash table is recommended.
-**ORIENTATION** — Set to `row` (the default) for row-oriented storage, or parquet. The
parquet column-oriented format can be more efficient for large-scale queries. This option
is only valid if `APPENDONLY=TRUE`. Heap-storage tables can only be row-oriented.
+**ORIENTATION** — Set to `row` (the default) for row-oriented storage, or parquet. The
parquet column-oriented format can be more efficient for large-scale queries. This option
is only valid if `APPENDONLY=TRUE`. 
 **COMPRESSTYPE** — Set to `ZLIB`, `SNAPPY`, or `GZIP` to specify the type of compression
used. `ZLIB` provides more compact compression ratios at lower speeds. Parquet tables support
`SNAPPY` and `GZIP` compression. Append-only tables support `SNAPPY` and `ZLIB` compression.
 This option is valid only if `APPENDONLY=TRUE`.
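The storage options described in this hunk could be combined in a single `WITH` clause. A minimal sketch, assuming the HAWQ `CREATE TABLE` syntax this reference page documents (the table and column names are hypothetical):

```sql
-- Hypothetical example: an append-only, parquet-oriented table with
-- SNAPPY compression, hash-distributed with an explicit bucket count.
CREATE TABLE sales_fact (
    sale_id    bigint,
    sale_date  date,
    amount     numeric(10,2)
)
WITH (APPENDONLY=true, ORIENTATION=parquet,
      COMPRESSTYPE=snappy, BUCKETNUM=8)
DISTRIBUTED BY (sale_id);
```

Per the text above, `ORIENTATION=parquet` and `COMPRESSTYPE` are only valid together with `APPENDONLY=TRUE`, and explicitly setting `BUCKETNUM` for a hash-distributed table is recommended over relying on the default.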
@@ -328,8 +328,8 @@ Using `SNAPPY` compression with parquet files is recommended for best
 **Memory occupation**: When inserting or loading data to a parquet table, the whole rowgroup
is stored in physical memory until the size exceeds the threshold or the end of the `INSERT`
operation. Once either occurs, the entire rowgroup is flushed to disk. Also, at the beginning
of the `INSERT` operation, each column is pre-allocated a page buffer. The column pre-allocated
page buffer size should be `min(pageSizeLimit, rowgroupSizeLimit/estimatedColumnWidth/estimatedRecordWidth)`
for the first rowgroup. For each following rowgroup, it should be `min(pageSizeLimit,
actualColumnChunkSize in last rowgroup * 1.05)`, where 1.05 is the estimated
scaling factor. When reading data from a parquet table, the requested columns of the row
group are loaded into memory. Memory is allocated 8 MB by default. Ensure that memory occupation
does not exceed physical memory when setting `ROWGROUPSIZE` or `PAGESIZE`; otherwise you may
encounter an out of memory error.
-**Batch vs. individual inserts**
-Only batch loading should be used with parquet files. Repeated individual inserts can result
in bloated footers.
+**Bulk vs. trickle loads**
+Only bulk loads are recommended for use with parquet tables. Trickle loads can result in
bloated footers and larger data files.
 ## <a id="parquetexamples"></a>Parquet Examples
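The bulk-versus-trickle guidance in the hunk above can be sketched as follows, assuming a hypothetical parquet table `sales_parquet` and staging table `sales_staging`:

```sql
-- Bulk load (recommended): a single INSERT ... SELECT writes large
-- rowgroups, keeping footer metadata compact.
INSERT INTO sales_parquet SELECT * FROM sales_staging;

-- Trickle load (discouraged): each single-row INSERT flushes its own
-- small rowgroup, bloating footers and producing larger data files.
-- INSERT INTO sales_parquet VALUES (1, DATE '2016-08-19', 9.99);
```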
