carbondata-commits mailing list archives

From chenliang...@apache.org
Subject [1/2] incubator-carbondata git commit: added test cases for ThriftWriter
Date Fri, 20 Jan 2017 21:45:20 GMT
Repository: incubator-carbondata
Updated Branches:
  refs/heads/master bd09f9bc1 -> e476f05a8


added test cases for ThriftWriter

Added MD Files for File Structure and Data Types

removed unused files

removed overview-of-carbondata.md


Project: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/commit/34cb9cf6
Tree: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/tree/34cb9cf6
Diff: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/diff/34cb9cf6

Branch: refs/heads/master
Commit: 34cb9cf639d7a36614be5816c60fdd0a6253b9f1
Parents: bd09f9b
Author: PallaviSingh1992 <pallavisingh_1992@yahoo.co.in>
Authored: Thu Nov 24 11:24:37 2016 +0530
Committer: chenliang613 <chenliang613@huawei.com>
Committed: Sat Jan 21 05:43:56 2017 +0800

----------------------------------------------------------------------
 README.md                                  |  2 ++
 docs/file-structure-of-carbondata.md       | 17 +++++++++++++++++
 docs/supported-data-types-in-carbondata.md | 20 ++++++++++++++++++++
 3 files changed, 39 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-carbondata/blob/34cb9cf6/README.md
----------------------------------------------------------------------
diff --git a/README.md b/README.md
index 07f7ebf..db1a1e2 100644
--- a/README.md
+++ b/README.md
@@ -40,6 +40,8 @@ CarbonData is built using Apache Maven, to [build CarbonData](https://github.com
 
 ## Online Documentation
 * [Quick Start](https://github.com/apache/incubator-carbondata/blob/master/docs/quick-start-guide.md)
+* [CarbonData File Structure](https://github.com/apache/incubator-carbondata/blob/master/docs/file-structure-of-carbondata.md)
+* [Data Types](https://github.com/apache/incubator-carbondata/blob/master/docs/supported-data-types-in-carbondata.md)
 * [Data Management](https://github.com/apache/incubator-carbondata/blob/master/docs/data-management.md)
 * [DDL Operations on CarbonData](https://github.com/apache/incubator-carbondata/blob/master/docs/ddl-operation-on-carbondata.md)

 * [DML Operations on CarbonData](https://github.com/apache/incubator-carbondata/blob/master/docs/dml-operation-on-carbondata.md)
 

http://git-wip-us.apache.org/repos/asf/incubator-carbondata/blob/34cb9cf6/docs/file-structure-of-carbondata.md
----------------------------------------------------------------------
diff --git a/docs/file-structure-of-carbondata.md b/docs/file-structure-of-carbondata.md
new file mode 100644
index 0000000..fd0f708
--- /dev/null
+++ b/docs/file-structure-of-carbondata.md
@@ -0,0 +1,17 @@
+#  CarbonData File Structure
+
+CarbonData files contain groups of data called blocklets, along with all required information like schema, offsets and indices etc, in a file footer, co-located in HDFS.
+
+The file footer can be read once to build the indices in memory, which can be utilized for optimizing the scans and processing for all subsequent queries.
+
+Each blocklet in the file is further divided into chunks of data called data chunks. Each data chunk is organized either in columnar format or row format, and stores the data of either a single column or a set of columns. All blocklets in a file contain the same number and type of data chunks.
+
+![CarbonData File Structure](../docs/images/carbon_data_file_structure_new.png?raw=true)
+
+Each data chunk contains multiple groups of data called as pages. There are three types of pages.
+
+* Data Page: Contains the encoded data of a column/group of columns.
+* Row ID Page (optional): Contains the row ID mappings used when the data page is stored as an inverted index.
+* RLE Page (optional): Contains additional metadata used when the data page is RLE coded.
+
+![CarbonData File Format](../docs/images/carbon_data_format_new.png?raw=true)
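
Reader's note (not part of the commit): the hierarchy described in file-structure-of-carbondata.md above (file, blocklets, data chunks, pages, plus a footer) can be pictured with a small Python model. This is only an illustrative sketch. Every class, field, and function name below is hypothetical and does not reflect CarbonData's actual on-disk encoding or its Java/Thrift classes.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DataChunk:
    """One data chunk: holds a single column or a set of columns."""
    columns: Tuple[str, ...]             # column name(s) stored in this chunk
    data_page: bytes                     # encoded column data
    row_id_page: Optional[bytes] = None  # only when the data page is an inverted index
    rle_page: Optional[bytes] = None     # only when the data page is RLE coded


@dataclass
class Blocklet:
    """A group of data inside the file, divided into data chunks."""
    chunks: List[DataChunk]


@dataclass
class Footer:
    """Hypothetical footer: schema plus per-blocklet offsets, read once per file."""
    schema: List[str]
    blocklet_offsets: List[int]


@dataclass
class CarbonFile:
    blocklets: List[Blocklet]
    footer: Footer


def chunk_layout(blocklet: Blocklet) -> Tuple[Tuple[str, ...], ...]:
    """Return the ordered column grouping of a blocklet's data chunks."""
    return tuple(chunk.columns for chunk in blocklet.chunks)


def has_uniform_layout(carbon_file: CarbonFile) -> bool:
    """Check that all blocklets contain the same number and type of chunks."""
    layouts = {chunk_layout(b) for b in carbon_file.blocklets}
    return len(layouts) <= 1
```

Under these assumptions, `has_uniform_layout` expresses the statement that all blocklets in a file contain the same number and type of data chunks, and the optional page fields mirror the Row ID and RLE pages.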

http://git-wip-us.apache.org/repos/asf/incubator-carbondata/blob/34cb9cf6/docs/supported-data-types-in-carbondata.md
----------------------------------------------------------------------
diff --git a/docs/supported-data-types-in-carbondata.md b/docs/supported-data-types-in-carbondata.md
new file mode 100644
index 0000000..01bd6e3
--- /dev/null
+++ b/docs/supported-data-types-in-carbondata.md
@@ -0,0 +1,20 @@
+#  Data Types
+
+#### CarbonData supports the following data types:
+
+  * Numeric Types
+  * SMALLINT
+  * INT/INTEGER
+  * BIGINT
+  * DOUBLE
+  * DECIMAL
+
+  * Date/Time Types
+  * TIMESTAMP
+
+  * String Types
+  * STRING
+
+  * Complex Types
+    * arrays: ARRAY``<data_type>``
+    * structs: STRUCT``<col_name : data_type COMMENT col_comment, ...>``

