carbondata-commits mailing list archives

From chenliang...@apache.org
Subject [21/39] carbondata-site git commit: Handled comments
Date Fri, 07 Sep 2018 16:54:08 GMT
http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/a51dc596/src/site/markdown/bloomfilter-datamap-guide.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/bloomfilter-datamap-guide.md b/src/site/markdown/bloomfilter-datamap-guide.md
index dd590e1..b2e7d60 100644
--- a/src/site/markdown/bloomfilter-datamap-guide.md
+++ b/src/site/markdown/bloomfilter-datamap-guide.md
@@ -73,7 +73,7 @@ For instance, main table called **datamap_test** which is defined as:
     age int,
     city string,
     country string)
-  STORED BY 'carbondata'
+  STORED AS carbondata
   TBLPROPERTIES('SORT_COLUMNS'='id')
   ```
 
@@ -146,14 +146,3 @@ You can refer to the corresponding section in `CarbonData Lucene DataMap`.
  there is still a chance that BloomFilter datamap can enhance the performance for concurrent query.
 + Note that BloomFilter datamap will decrease the data loading performance and may cause slightly storage expansion (for datamap index file).
 
-<script>
-$(function() {
-  // Show selected style on nav item
-  $('.b-nav__datamap').addClass('selected');
-  
-  if (!$('.b-nav__datamap').parent().hasClass('nav__item__with__subs--expanded')) {
-    // Display datamap subnav items
-    $('.b-nav__datamap').parent().toggleClass('nav__item__with__subs--expanded');
-  }
-});
-</script>
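
Several hunks in this commit replace the legacy clause `STORED BY 'carbondata'` with `STORED AS carbondata`. The following is a minimal sketch of how such a mechanical rewrite could be scripted over documentation text. It is in Python for illustration only; the helper name `modernize_stored_clause`, the regular expression, and the sample DDL are assumptions and not part of the commit.

```python
import re

# Hypothetical helper, not part of the commit: matches the legacy clause
# regardless of case or the amount of whitespace between the keywords.
_STORED_BY = re.compile(r"STORED\s+BY\s+'carbondata'", re.IGNORECASE)


def modernize_stored_clause(text: str) -> str:
    """Return text with STORED BY 'carbondata' rewritten to STORED AS carbondata."""
    return _STORED_BY.sub("STORED AS carbondata", text)


# Illustrative DDL modeled on the hunk above; the column list is abbreviated.
sample = """CREATE TABLE datamap_test (
  age int,
  city string,
  country string)
STORED BY 'carbondata'
TBLPROPERTIES('SORT_COLUMNS'='id')"""

print(modernize_stored_clause(sample))
```

Text that does not contain the legacy clause is returned unchanged, so the helper can be applied repeatedly without side effects.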

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/a51dc596/src/site/markdown/configuration-parameters.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/configuration-parameters.md b/src/site/markdown/configuration-parameters.md
index de72439..c8c74f2 100644
--- a/src/site/markdown/configuration-parameters.md
+++ b/src/site/markdown/configuration-parameters.md
@@ -16,7 +16,7 @@
 -->
 
 # Configuring CarbonData
- This guide explains the configurations that can be used to tune CarbonData to achieve better performance.Some of the properties can be set dynamically and are explained in the section Dynamic Configuration In CarbonData Using SET-RESET.Most of the properties that control the internal settings have reasonable default values.They are listed along with the properties along with explanation.
+ This guide explains the configurations that can be used to tune CarbonData to achieve better performance.Most of the properties that control the internal settings have reasonable default values.They are listed along with the properties along with explanation.
 
  * [System Configuration](#system-configuration)
  * [Data Loading Configuration](#data-loading-configuration)
@@ -59,7 +59,7 @@ This section provides the details of all the configurations required for the Car
 | carbon.bad.records.action | FAIL | CarbonData in addition to identifying the bad records, can take certain actions on such data.This configuration can have four types of actions for bad records namely FORCE, REDIRECT, IGNORE and FAIL. If set to FORCE then it auto-corrects the data by storing the bad records as NULL. If set to REDIRECT then bad records are written to the raw CSV instead of being loaded. If set to IGNORE then bad records are neither loaded nor written to the raw CSV. If set to FAIL then data loading fails if any bad records are found. |
 | carbon.options.is.empty.data.bad.record | false | Based on the business scenarios, empty("" or '' or ,,) data can be valid or invalid. This configuration controls how empty data should be treated by CarbonData. If false, then empty ("" or '' or ,,) data will not be considered as bad record and vice versa. |
 | carbon.options.bad.record.path | (none) | Specifies the HDFS path where bad records are to be stored. By default the value is Null. This path must to be configured by the user if ***carbon.options.bad.records.logger.enable*** is **true** or ***carbon.bad.records.action*** is **REDIRECT**. |
-| carbon.blockletgroup.size.in.mb | 64 | Please refer to [file-structure-of-carbondata](./file-structure-of-carbondata.md ) to understand the storage format of CarbonData.The data are read as a group of blocklets which are called blocklet groups. This parameter specifies the size of each blocklet group. Higher value results in better sequential IO access.The minimum value is 16MB, any value lesser than 16MB will reset to the default value (64MB).**NOTE:** Configuring a higher value might lead to poor performance as an entire blocklet group will have to read into memory before processing.For filter queries with limit, it is **not advisable** to have a bigger blocklet size.For Aggregation queries which need to return more number of rows,bigger blocklet size is advisable. |
+| carbon.blockletgroup.size.in.mb | 64 | Please refer to [file-structure-of-carbondata](./file-structure-of-carbondata.md#carbondata-file-format) to understand the storage format of CarbonData.The data are read as a group of blocklets which are called blocklet groups. This parameter specifies the size of each blocklet group. Higher value results in better sequential IO access.The minimum value is 16MB, any value lesser than 16MB will reset to the default value (64MB).**NOTE:** Configuring a higher value might lead to poor performance as an entire blocklet group will have to read into memory before processing.For filter queries with limit, it is **not advisable** to have a bigger blocklet size.For Aggregation queries which need to return more number of rows,bigger blocklet size is advisable. |
 | carbon.sort.file.write.buffer.size | 16384 | CarbonData sorts and writes data to intermediate files to limit the memory usage.This configuration determines the buffer size to be used for reading and writing such files. **NOTE:** This configuration is useful to tune IO and derive optimal performance.Based on the OS and underlying harddisk type, these values can significantly affect the overall performance.It is ideal to tune the buffersize equivalent to the IO buffer size of the OS.Recommended range is between 10240 to 10485760 bytes. |
 | carbon.sort.intermediate.files.limit | 20 | CarbonData sorts and writes data to intermediate files to limit the memory usage.Before writing the target carbondat file, the data in these intermediate files needs to be sorted again so as to ensure the entire data in the data load is sorted.This configuration determines the minimum number of intermediate files after which merged sort is applied on them sort the data.**NOTE:** Intermediate merging happens on a separate thread in the background.Number of threads used is determined by ***carbon.merge.sort.reader.thread***.Configuring a low value will cause more time to be spent in merging these intermediate merged files which can cause more IO.Configuring a high value would cause not to use the idle threads to do intermediate sort merges.Range of recommended values are between 2 and 50 |
 | carbon.csv.read.buffersize.byte | 1048576 | CarbonData uses Hadoop InputFormat to read the csv files.This configuration value is used to pass buffer size as input for the Hadoop MR job when reading the csv files.This value is configured in bytes.**NOTE:** Refer to ***org.apache.hadoop.mapreduce.InputFormat*** documentation for additional information. |
@@ -70,7 +70,7 @@ This section provides the details of all the configurations required for the Car
 | carbon.enable.calculate.size | true | **For Load Operation**: Setting this property calculates the size of the carbon data file (.carbondata) and carbon index file (.carbonindex) for every load and updates the table status file. **For Describe Formatted**: Setting this property calculates the total size of the carbon data files and carbon index files for the respective table and displays in describe formatted command.**NOTE:** This is useful to determine the overall size of the carbondata table and also get an idea of how the table is growing in order to take up other backup strategy decisions. |
 | carbon.cutOffTimestamp | (none) | CarbonData has capability to generate the Dictionary values for the timestamp columns from the data itself without the need to store the computed dictionary values. This configuration sets the start date for calculating the timestamp. Java counts the number of milliseconds from start of "1970-01-01 00:00:00". This property is used to customize the start of position. For example "2000-01-01 00:00:00". **NOTE:** The date must be in the form ***carbon.timestamp.format***. CarbonData supports storing data for upto 68 years.For example, if the cut-off time is 1970-01-01 05:30:00, then data upto 2038-01-01 05:30:00 will be supported by CarbonData. |
 | carbon.timegranularity | SECOND | The configuration is used to specify the data granularity level such as DAY, HOUR, MINUTE, or SECOND.This helps to store more than 68 years of data into CarbonData. |
-| carbon.use.local.dir | false | CarbonData during data loading, writes files to local temp directories before copying the files to HDFS.This configuration is used to specify whether CarbonData can write locally to tmp directory of the container or to the YARN application directory. |
+| carbon.use.local.dir | false | CarbonData,during data loading, writes files to local temp directories before copying the files to HDFS.This configuration is used to specify whether CarbonData can write locally to tmp directory of the container or to the YARN application directory. |
 | carbon.use.multiple.temp.dir | false | When multiple disks are present in the system, YARN is generally configured with multiple disks to be used as temp directories for managing the containers.This configuration specifies whether to use multiple YARN local directories during data loading for disk IO load balancing.Enable ***carbon.use.local.dir*** for this configuration to take effect.**NOTE:** Data Loading is an IO intensive operation whose performance can be limited by the disk IO threshold, particularly during multi table concurrent data load.Configuring this parameter, balances the disk IO across multiple disks there by improving the over all load performance. |
 | carbon.sort.temp.compressor | (none) | CarbonData writes every ***carbon.sort.size*** number of records to intermediate temp files during data loading to ensure memory footprint is within limits.These temporary files cab be compressed and written in order to save the storage space.This configuration specifies the name of compressor to be used to compress the intermediate sort temp files during sort procedure in data loading.The valid values are 'SNAPPY','GZIP','BZIP2','LZ4','ZSTD' and empty. By default, empty means that Carbondata will not compress the sort temp files.**NOTE:** Compressor will be useful if you encounter disk bottleneck.Since the data needs to be compressed and decompressed,it involves additional CPU cycles,but is compensated by the high IO throughput due to less data to be written or read from the disks. |
 | carbon.load.skewedDataOptimization.enabled | false | During data loading,CarbonData would divide the number of blocks equally so as to ensure all executors process same number of blocks.This mechanism satisfies most of the scenarios and ensures maximum parallel processing for optimal data loading performance.In some business scenarios, there might be scenarios where the size of blocks vary significantly and hence some executors would have to do more work if they get blocks containing more data. This configuration enables size based block allocation strategy for data loading.When loading, carbondata will use file size based block allocation strategy for task distribution. It will make sure that all the executors process the same size of data.**NOTE:** This configuration is useful if the size of your input data files varies widely, say 1MB~1GB.For this configuration to work effectively,knowing the data pattern and size is important and necessary. |
@@ -107,7 +107,7 @@ This section provides the details of all the configurations required for the Car
 | carbon.numberof.preserve.segments | 0 | If the user wants to preserve some number of segments from being compacted then he can set this configuration. Example: carbon.numberof.preserve.segments = 2 then 2 latest segments will always be excluded from the compaction. No segments will be preserved by default.**NOTE:** This configuration is useful when the chances of input data can be wrong due to environment scenarios.Preserving some of the latest segments from being compacted can help to easily delete the wrongly loaded segments.Once compacted,it becomes more difficult to determine the exact data to be deleted(except when data is incrementing according to time) |
 | carbon.allowed.compaction.days | 0 | This configuration is used to control on the number of recent segments that needs to be compacted, ignoring the older ones.This congifuration is in days.For Example: If the configuration is 2, then the segments which are loaded in the time frame of past 2 days only will get merged. Segments which are loaded earlier than 2 days will not be merged. This configuration is disabled by default.**NOTE:** This configuration is useful when a bulk of history data is loaded into the carbondata.Query on this data is less frequent.In such cases involving these segments also into compacation will affect the resource consumption, increases overall compaction time. |
 | carbon.enable.auto.load.merge | false | Compaction can be automatically triggered once data load completes.This ensures that the segments are merged in time and thus query times doesnt increase with increase in segments.This configuration enables to do compaction along with data loading.**NOTE: **Compaction will be triggered once the data load completes.But the status of data load wait till the compaction is completed.Hence it might look like data loading time has increased, but thats not the case.Moreover failure of compaction will not affect the data loading status.If data load had completed successfully, the status would be updated and segments are committed.However, failure while data loading, will not trigger compaction and error is returned immediately. |
-| carbon.enable.page.level.reader.in.compaction|true|Enabling page level reader for compaction reduces the memory usage while compacting more number of segments. It allows reading only page by page instead of reading whole blocklet to memory.**NOTE:** Please refer to [file-structure-of-carbondata](./file-structure-of-carbondata.md ) to understand the storage format of CarbonData and concepts of pages.|
+| carbon.enable.page.level.reader.in.compaction|true|Enabling page level reader for compaction reduces the memory usage while compacting more number of segments. It allows reading only page by page instead of reading whole blocklet to memory.**NOTE:** Please refer to [file-structure-of-carbondata](./file-structure-of-carbondata.md#carbondata-file-format) to understand the storage format of CarbonData and concepts of pages.|
 | carbon.concurrent.compaction | true | Compaction of different tables can be executed concurrently.This configuration determines whether to compact all qualifying tables in parallel or not.**NOTE: **Compacting concurrently is a resource demanding operation and needs more resouces there by affecting the query performance also.This configuration is **deprecated** and might be removed in future releases. |
 | carbon.compaction.prefetch.enable | false | Compaction operation is similar to Query + data load where in data from qualifying segments are queried and data loading performed to generate a new single segment.This configuration determines whether to query ahead data from segments and feed it for data loading.**NOTE: **This configuration is disabled by default as it needs extra resources for querying ahead extra data.Based on the memory availability on the cluster, user can enable it to improve compaction performance. |
 | carbon.merge.index.in.segment | true | Each CarbonData file has a companion CarbonIndex file which maintains the metadata about the data.These CarbonIndex files are read and loaded into driver and is used subsequently for pruning of data during queries.These CarbonIndex files are very small in size(few KB) and are many.Reading many small files from HDFS is not efficient and leads to slow IO performance.Hence these CarbonIndex files belonging to a segment can be combined into  a single file and read once there by increasing the IO throughput.This configuration enables to merge all the CarbonIndex files into a single MergeIndex file upon data loading completion.**NOTE:** Reading a single big file is more efficient in HDFS and IO throughput is very high.Due to this the time needed to load the index files into memory when query is received for the first time on that table is significantly reduced and there by significantly reduces the delay in serving the first query. |
@@ -235,16 +235,3 @@ RESET
 * Success will be recorded in the driver log.
 
 * Failure will be displayed in the UI.
-
-
-<script>
-$(function() {
-  // Show selected style on nav item
-  $('.b-nav__docs').addClass('selected');
-
-  // Display docs subnav items
-  if (!$('.b-nav__docs').parent().hasClass('nav__item__with__subs--expanded')) {
-    $('.b-nav__docs').parent().toggleClass('nav__item__with__subs--expanded');
-  }
-});
-</script>
\ No newline at end of file
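
The `carbon.blockletgroup.size.in.mb` row in the diff above documents that the minimum value is 16 MB and that any smaller value resets to the 64 MB default. The sketch below only restates that documented rule. It is in Python for illustration; the function name `effective_blocklet_group_size_mb` is hypothetical and this is not CarbonData's actual implementation.

```python
# Values taken from the documentation row above.
DEFAULT_MB = 64  # documented default
MINIMUM_MB = 16  # documented minimum


def effective_blocklet_group_size_mb(configured_mb: int) -> int:
    """Apply the documented rule: values below the minimum fall back to the default."""
    if configured_mb < MINIMUM_MB:
        return DEFAULT_MB
    return configured_mb


for value in (8, 16, 128):
    print(value, "->", effective_blocklet_group_size_mb(value))
```

Note that a too-small value is not clamped to 16 MB; per the documentation it reverts to the default of 64 MB.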

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/a51dc596/src/site/markdown/datamap-developer-guide.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/datamap-developer-guide.md b/src/site/markdown/datamap-developer-guide.md
index 52c993c..6bac9b5 100644
--- a/src/site/markdown/datamap-developer-guide.md
+++ b/src/site/markdown/datamap-developer-guide.md
@@ -15,16 +15,5 @@ Currently, the provider string can be:
 
 When user issues `DROP DATAMAP dm ON TABLE main`, the corresponding DataMapProvider interface will be called.
 
-Details about [DataMap Management](./datamap-management.md#datamap-management) and supported [DSL](./datamap-management.md#overview) are documented [here](./datamap-management.md).
+Details about [DataMap Management](./datamap/datamap-management.md#datamap-management) and supported [DSL](./datamap/datamap-management.md#overview) are documented [here](./datamap/datamap-management.md).
 
-<script>
-$(function() {
-  // Show selected style on nav item
-  $('.b-nav__docs').addClass('selected');
-
-  // Display docs subnav items
-  if (!$('.b-nav__docs').parent().hasClass('nav__item__with__subs--expanded')) {
-    $('.b-nav__docs').parent().toggleClass('nav__item__with__subs--expanded');
-  }
-});
-</script>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/a51dc596/src/site/markdown/datamap-management.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/datamap-management.md b/src/site/markdown/datamap-management.md
index cc17d69..eee03a7 100644
--- a/src/site/markdown/datamap-management.md
+++ b/src/site/markdown/datamap-management.md
@@ -149,15 +149,3 @@ This feature applies for preaggregate datamap only
 Running Compaction command (`ALTER TABLE COMPACT`) on main table will **not automatically** compact the pre-aggregate tables created on the main table. User need to run Compaction command separately on each pre-aggregate table to compact them.
 
 Compaction is an optional operation for pre-aggregate table. If compaction is performed on main table but not performed on pre-aggregate table, all queries still can benefit from pre-aggregate tables. To further improve the query performance, compaction on pre-aggregate tables can be triggered to merge the segments and files in the pre-aggregate tables.
-
-<script>
-$(function() {
-  // Show selected style on nav item
-  $('.b-nav__datamap').addClass('selected');
-  
-  if (!$('.b-nav__datamap').parent().hasClass('nav__item__with__subs--expanded')) {
-    // Display datamap subnav items
-    $('.b-nav__datamap').parent().toggleClass('nav__item__with__subs--expanded');
-  }
-});
-</script>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/a51dc596/src/site/markdown/ddl-of-carbondata.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/ddl-of-carbondata.md b/src/site/markdown/ddl-of-carbondata.md
index 5535a40..acaac43 100644
--- a/src/site/markdown/ddl-of-carbondata.md
+++ b/src/site/markdown/ddl-of-carbondata.md
@@ -75,15 +75,31 @@ CarbonData DDL statements are documented here,which includes:
   **NOTE:** CarbonData also supports "STORED AS carbondata" and "USING carbondata". Find example code at [CarbonSessionExample](https://github.com/apache/carbondata/blob/master/examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonSessionExample.scala) in the CarbonData repo.
 ### Usage Guidelines
 
-**Supported properties:** [DICTIONARY_INCLUDE](#dictionary-encoding-configuration),[NO_INVERTED_INDEX](#inverted-index-configuration),[SORT_COLUMNS](#sort-columns-configuration),[SORT_SCOPE](#sort-scope-configuration),[TABLE_BLOCKSIZE](#table-block-size-configuration),[MAJOR_COMPACTION_SIZE](#table-compaction-configuration),
-
-[AUTO_LOAD_MERGE](#table-compaction-configuration),[COMPACTION_LEVEL_THRESHOLD](#table-compaction-configuration),[COMPACTION_PRESERVE_SEGMENTS](#table-compaction-configuration),[ALLOWED_COMPACTION_DAYS](#table-compaction-configuration),
-
-[streaming](#streaming),[LOCAL_DICTIONARY_ENABLE](#local-dictionary-configuration),[LOCAL_DICTIONARY_THRESHOLD](#local-dictionary-configuration),[LOCAL_DICTIONARY_INCLUDE](#local-dictionary-configuration),
-
-[LOCAL_DICTIONARY_EXCLUDE](#local-dictionary-configuration),[COLUMN_META_CACHE](#caching-minmax-value-for-required-columns),[CACHE_LEVEL](#caching-at-block-or-blocklet-level),[flat_folder](#support-flat-folder-same-as-hiveparquet),[LONG_STRING_COLUMNS](#string-longer-than-32000-characters),[BUCKETNUMBER](#bucketing),
-
-[BUCKETCOLUMNS](#bucketing)
+**Supported properties:**
+
+| Property                                                     | Description                                                  |
+| ------------------------------------------------------------ | ------------------------------------------------------------ |
+| [DICTIONARY_INCLUDE](#dictionary-encoding-configuration)     | Columns for which dictionary needs to be generated           |
+| [NO_INVERTED_INDEX](#inverted-index-configuration)           | Columns to exclude from inverted index generation            |
+| [SORT_COLUMNS](#sort-columns-configuration)                  | Columns to include in sort and its order of sort             |
+| [SORT_SCOPE](#sort-scope-configuration)                      | Sort scope of the load.Options include no sort, local sort ,batch sort and global sort |
+| [TABLE_BLOCKSIZE](#table-block-size-configuration)           | Size of blocks to write onto hdfs                            |
+| [MAJOR_COMPACTION_SIZE](#table-compaction-configuration)     | Size upto which the segments can be combined into one        |
+| [AUTO_LOAD_MERGE](#table-compaction-configuration)           | Whether to auto compact the segments                         |
+| [COMPACTION_LEVEL_THRESHOLD](#table-compaction-configuration) | Number of segments to compact into one segment               |
+| [COMPACTION_PRESERVE_SEGMENTS](#table-compaction-configuration) | Number of latest segments that needs to be excluded from compaction |
+| [ALLOWED_COMPACTION_DAYS](#table-compaction-configuration)   | Segments generated within the configured time limit in days will be compacted, skipping others |
+| [streaming](#streaming)                                      | Whether the table is a streaming table                       |
+| [LOCAL_DICTIONARY_ENABLE](#local-dictionary-configuration)   | Enable local dictionary generation                           |
+| [LOCAL_DICTIONARY_THRESHOLD](#local-dictionary-configuration) | Cardinality upto which the local dictionary can be generated |
+| [LOCAL_DICTIONARY_INCLUDE](#local-dictionary-configuration)  | Columns for which local dictionary needs to be generated.Useful when local dictionary need not be generated for all string/varchar/char columns |
+| [LOCAL_DICTIONARY_EXCLUDE](#local-dictionary-configuration)  | Columns for which local dictionary generation should be skipped.Useful when local dictionary need not be generated for few string/varchar/char columns |
+| [COLUMN_META_CACHE](#caching-minmax-value-for-required-columns) | Columns whose metadata can be cached in Driver for efficient pruning and improved query performance |
+| [CACHE_LEVEL](#caching-at-block-or-blocklet-level)           | Column metadata caching level.Whether to cache column metadata of block or blocklet |
+| [flat_folder](#support-flat-folder-same-as-hiveparquet)      | Whether to write all the carbondata files in a single folder.Not writing segments folder during incremental load |
+| [LONG_STRING_COLUMNS](#string-longer-than-32000-characters)  | Columns which are greater than 32K characters                |
+| [BUCKETNUMBER](#bucketing)                                   | Number of buckets to be created                              |
+| [BUCKETCOLUMNS](#bucketing)                                  | Columns which are to be placed in buckets                    |
 
  Following are the guidelines for TBLPROPERTIES, CarbonData's additional table options can be set via carbon.properties.
 
@@ -135,15 +151,15 @@ CarbonData DDL statements are documented here,which includes:
 
    ```
     CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
-                                   productNumber INT,
-                                   productName STRING,
-                                   storeCity STRING,
-                                   storeProvince STRING,
-                                   productCategory STRING,
-                                   productBatch STRING,
-                                   saleQuantity INT,
-                                   revenue INT)
-    STORED BY 'carbondata'
+      productNumber INT,
+      productName STRING,
+      storeCity STRING,
+      storeProvince STRING,
+      productCategory STRING,
+      productBatch STRING,
+      saleQuantity INT,
+      revenue INT)
+    STORED AS carbondata
     TBLPROPERTIES ('SORT_COLUMNS'='productName,storeCity',
                    'SORT_SCOPE'='NO_SORT')
    ```
@@ -222,10 +238,10 @@ CarbonData DDL statements are documented here,which includes:
 
 | Properties | Default value | Description |
 | ---------- | ------------- | ----------- |
-| LOCAL_DICTIONARY_ENABLE | false | Whether to enable local dictionary generation. **NOTE:** If this property is defined, it will override the value configured at system level by 'carbon.local.dictionary.enable' |
+| LOCAL_DICTIONARY_ENABLE | false | Whether to enable local dictionary generation. **NOTE:** If this property is defined, it will override the value configured at system level by '***carbon.local.dictionary.enable***'.Local dictionary will be generated for all string/varchar/char columns unless LOCAL_DICTIONARY_INCLUDE, LOCAL_DICTIONARY_EXCLUDE is configured. |
 | LOCAL_DICTIONARY_THRESHOLD | 10000 | The maximum cardinality of a column upto which carbondata can try to generate local dictionary (maximum - 100000) |
-| LOCAL_DICTIONARY_INCLUDE | string/varchar/char columns| Columns for which Local Dictionary has to be generated.**NOTE:** Those string/varchar/char columns which are added into DICTIONARY_INCLUDE option will not be considered for local dictionary generation.|
-| LOCAL_DICTIONARY_EXCLUDE | none | Columns for which Local Dictionary need not be generated. |
+| LOCAL_DICTIONARY_INCLUDE | string/varchar/char columns| Columns for which Local Dictionary has to be generated.**NOTE:** Those string/varchar/char columns which are added into DICTIONARY_INCLUDE option will not be considered for local dictionary generation.This property needs to be configured only when local dictionary needs to be generated for few columns, skipping others.This property takes effect only when **LOCAL_DICTIONARY_ENABLE** is true or **carbon.local.dictionary.enable** is true |
+| LOCAL_DICTIONARY_EXCLUDE | none | Columns for which Local Dictionary need not be generated.This property needs to be configured only when local dictionary needs to be skipped for few columns, generating for others.This property takes effect only when **LOCAL_DICTIONARY_ENABLE** is true or **carbon.local.dictionary.enable** is true |
 
    **Fallback behavior:** 
 
@@ -252,7 +268,7 @@ CarbonData DDL statements are documented here,which includes:
              
                column3 LONG )
              
-     STORED BY 'carbondata'
+     STORED AS carbondata
      TBLPROPERTIES('LOCAL_DICTIONARY_ENABLE'='true','LOCAL_DICTIONARY_THRESHOLD'='1000',
      'LOCAL_DICTIONARY_INCLUDE'='column1','LOCAL_DICTIONARY_EXCLUDE'='column2')
    ```
@@ -407,7 +423,7 @@ CarbonData DDL statements are documented here,which includes:
 
   ```
   CREATE TABLE [IF NOT EXISTS] [db_name.]table_name 
-  STORED BY 'carbondata' 
+  STORED AS carbondata 
   [TBLPROPERTIES (key1=val1, key2=val2, ...)] 
   AS select_statement;
   ```
@@ -424,7 +440,7 @@ CarbonData DDL statements are documented here,which includes:
   carbon.sql("INSERT INTO source_table SELECT 2,'david','shenzhen',31")
   
   carbon.sql("CREATE TABLE target_table
-              STORED BY 'carbondata'
+              STORED AS carbondata
               AS SELECT city,avg(age) FROM source_table GROUP BY city")
               
   carbon.sql("SELECT * FROM target_table").show
@@ -441,7 +457,7 @@ CarbonData DDL statements are documented here,which includes:
   This function allows user to create external table by specifying location.
   ```
   CREATE EXTERNAL TABLE [IF NOT EXISTS] [db_name.]table_name 
-  STORED BY 'carbondata' LOCATION ‘$FilesPath’
+  STORED AS carbondata LOCATION ‘$FilesPath’
   ```
 
 ### Create external table on managed table data location.
@@ -450,14 +466,14 @@ CarbonData DDL statements are documented here,which includes:
 
   **Example:**
   ```
-  sql("CREATE TABLE origin(key INT, value STRING) STORED BY 'carbondata'")
+  sql("CREATE TABLE origin(key INT, value STRING) STORED AS carbondata")
   sql("INSERT INTO origin select 100,'spark'")
   sql("INSERT INTO origin select 200,'hive'")
   // creates a table in $storeLocation/origin
   
   sql(s"""
   |CREATE EXTERNAL TABLE source
-  |STORED BY 'carbondata'
+  |STORED AS carbondata
   |LOCATION '$storeLocation/origin'
   """.stripMargin)
   checkAnswer(sql("SELECT count(*) from source"), sql("SELECT count(*) from origin"))
@@ -470,7 +486,7 @@ CarbonData DDL statements are documented here,which includes:
   **Example:**
   ```
   sql(
-  s"""CREATE EXTERNAL TABLE sdkOutputTable STORED BY 'carbondata' LOCATION
+  s"""CREATE EXTERNAL TABLE sdkOutputTable STORED AS carbondata LOCATION
   |'$writerPath' """.stripMargin)
   ```
 
@@ -670,7 +686,7 @@ Users can specify which columns to include and exclude for local dictionary gene
   ```
   CREATE TABLE [IF NOT EXISTS] [db_name.]table_name[(col_name data_type [COMMENT col_comment], ...)]
     [COMMENT table_comment]
-  STORED BY 'carbondata'
+  STORED AS carbondata
   [TBLPROPERTIES (property_name=property_value, ...)]
   ```
 
@@ -679,7 +695,7 @@ Users can specify which columns to include and exclude for local dictionary gene
   CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
                                 productNumber Int COMMENT 'unique serial number for product')
   COMMENT “This is table comment”
-   STORED BY 'carbondata'
+   STORED AS carbondata
    TBLPROPERTIES ('DICTIONARY_INCLUDE'='productNumber')
   ```
   You can also SET and UNSET table comment using ALTER command.
@@ -725,7 +741,7 @@ Users can specify which columns to include and exclude for local dictionary gene
                                 saleQuantity INT,
                                 revenue INT)
   PARTITIONED BY (productCategory STRING, productBatch STRING)
-  STORED BY 'carbondata'
+  STORED AS carbondata
   ```
    NOTE: Hive partition is not supported on complex datatype columns.
 		
@@ -780,7 +796,7 @@ Users can specify which columns to include and exclude for local dictionary gene
   CREATE TABLE [IF NOT EXISTS] [db_name.]table_name
                     [(col_name data_type , ...)]
   PARTITIONED BY (partition_col_name data_type)
-  STORED BY 'carbondata'
+  STORED AS carbondata
   [TBLPROPERTIES ('PARTITION_TYPE'='HASH',
                   'NUM_PARTITIONS'='N' ...)]
   ```
@@ -796,7 +812,7 @@ Users can specify which columns to include and exclude for local dictionary gene
       col_D DECIMAL(10,2),
       col_F TIMESTAMP
   ) PARTITIONED BY (col_E LONG)
-  STORED BY 'carbondata' TBLPROPERTIES('PARTITION_TYPE'='HASH','NUM_PARTITIONS'='9')
+  STORED AS carbondata TBLPROPERTIES('PARTITION_TYPE'='HASH','NUM_PARTITIONS'='9')
   ```
 
 ### Create Range Partition Table
@@ -806,7 +822,7 @@ Users can specify which columns to include and exclude for local dictionary gene
   CREATE TABLE [IF NOT EXISTS] [db_name.]table_name
                     [(col_name data_type , ...)]
   PARTITIONED BY (partition_col_name data_type)
-  STORED BY 'carbondata'
+  STORED AS carbondata
   [TBLPROPERTIES ('PARTITION_TYPE'='RANGE',
                   'RANGE_INFO'='2014-01-01, 2015-01-01, 2016-01-01, ...')]
   ```
@@ -836,7 +852,7 @@ Users can specify which columns to include and exclude for local dictionary gene
   CREATE TABLE [IF NOT EXISTS] [db_name.]table_name
                     [(col_name data_type , ...)]
   PARTITIONED BY (partition_col_name data_type)
-  STORED BY 'carbondata'
+  STORED AS carbondata
   [TBLPROPERTIES ('PARTITION_TYPE'='LIST',
                   'LIST_INFO'='A, B, C, ...')]
   ```
@@ -851,7 +867,7 @@ Users can specify which columns to include and exclude for local dictionary gene
       col_E LONG,
       col_F TIMESTAMP
    ) PARTITIONED BY (col_A STRING)
-   STORED BY 'carbondata'
+   STORED AS carbondata
    TBLPROPERTIES('PARTITION_TYPE'='LIST',
    'LIST_INFO'='aaaa, bbbb, (cccc, dddd), eeee')
   ```
@@ -914,7 +930,7 @@ Users can specify which columns to include and exclude for local dictionary gene
   ```
   CREATE TABLE [IF NOT EXISTS] [db_name.]table_name
                     [(col_name data_type, ...)]
-  STORED BY 'carbondata'
+  STORED AS carbondata
   TBLPROPERTIES('BUCKETNUMBER'='noOfBuckets',
   'BUCKETCOLUMNS'='columnname')
   ```
@@ -926,27 +942,16 @@ Users can specify which columns to include and exclude for local dictionary gene
   Example:
   ```
   CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
-                                productNumber INT,
-                                saleQuantity INT,
-                                productName STRING,
-                                storeCity STRING,
-                                storeProvince STRING,
-                                productCategory STRING,
-                                productBatch STRING,
-                                revenue INT)
-  STORED BY 'carbondata'
+    productNumber INT,
+    saleQuantity INT,
+    productName STRING,
+    storeCity STRING,
+    storeProvince STRING,
+    productCategory STRING,
+    productBatch STRING,
+    revenue INT)
+  STORED AS carbondata
   TBLPROPERTIES ('BUCKETNUMBER'='4', 'BUCKETCOLUMNS'='productName')
   ```
 
-<script>
-$(function() {
-  // Show selected style on nav item
-  $('.b-nav__docs').addClass('selected');
-
-  // Display docs subnav items
-  if (!$('.b-nav__docs').parent().hasClass('nav__item__with__subs--expanded')) {
-    $('.b-nav__docs').parent().toggleClass('nav__item__with__subs--expanded');
-  }
-});
-</script>
 

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/a51dc596/src/site/markdown/dml-of-carbondata.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/dml-of-carbondata.md b/src/site/markdown/dml-of-carbondata.md
index de23f5b..42da655 100644
--- a/src/site/markdown/dml-of-carbondata.md
+++ b/src/site/markdown/dml-of-carbondata.md
@@ -40,7 +40,31 @@ CarbonData DML statements are documented here,which includes:
   OPTIONS(property_name=property_value, ...)
   ```
 
-  **Supported Properties:** [DELIMITER](#delimiter), [QUOTECHAR](#quotechar), [COMMENTCHAR](#commentchar), [HEADER](#header), [FILEHEADER](#fileheader), [MULTILINE](#multiline), [ESCAPECHAR](#escapechar), [SKIP_EMPTY_LINE](#skip_empty_line), [COMPLEX_DELIMITER_LEVEL_1](#complex_delimiter_level_1), [COMPLEX_DELIMITER_LEVEL_2](#complex_delimiter_level_2), [ALL_DICTIONARY_PATH](#all_dictionary_path), [COLUMNDICT](#columndict), [DATEFORMAT](#dateformat),[ TIMESTAMPFORMAT](#timestampformat), [SORT_COLUMN_BOUNDS](#sort-column-bounds), [SINGLE_PASS](#single_pass), [BAD_RECORDS_LOGGER_ENABLE](#bad-records-handling), [BAD_RECORD_PATH](#bad-records-handling), [BAD_RECORDS_ACTION](#bad-records-handling), [IS_EMPTY_DATA_BAD_RECORD](#bad-records-handling), [GLOBAL_SORT_PARTITIONS](#global_sort_partitions)
+  **Supported Properties:**
+
+| Property                                                | Description                                                  |
+| ------------------------------------------------------- | ------------------------------------------------------------ |
+| [DELIMITER](#delimiter)                                 | Character used to separate the data in the input csv file    |
+| [QUOTECHAR](#quotechar)                                 | Character used to quote the data in the input csv file       |
+| [COMMENTCHAR](#commentchar)                             | Character used to comment out rows in the input csv file. Those rows are skipped during processing |
+| [HEADER](#header)                                       | Whether the input csv files have a header row                |
+| [FILEHEADER](#fileheader)                               | Column names to use for data read from the input csv when it has no header row |
+| [MULTILINE](#multiline)                                 | Whether the data of a row can span multiple lines            |
+| [ESCAPECHAR](#escapechar)                               | Escape character used to escape the data in the input csv file. For example, \ is a standard escape character |
+| [SKIP_EMPTY_LINE](#skip_empty_line)                     | Whether empty lines in input csv file should be skipped or loaded as null row |
+| [COMPLEX_DELIMITER_LEVEL_1](#complex_delimiter_level_1) | Starting delimiter for complex type data in input csv file   |
+| [COMPLEX_DELIMITER_LEVEL_2](#complex_delimiter_level_2) | Ending delimiter for complex type data in input csv file     |
+| [ALL_DICTIONARY_PATH](#all_dictionary_path)             | Path to read the dictionary data for all columns             |
+| [COLUMNDICT](#columndict)                               | Path to read the dictionary data for a particular column     |
+| [DATEFORMAT](#dateformat)                               | Format of date in the input csv file                         |
+| [TIMESTAMPFORMAT](#timestampformat)                     | Format of timestamp in the input csv file                    |
+| [SORT_COLUMN_BOUNDS](#sort-column-bounds)               | How to partition the sort columns so that the data is evenly distributed |
+| [SINGLE_PASS](#single_pass)                             | Whether to enable single pass data loading                   |
+| [BAD_RECORDS_LOGGER_ENABLE](#bad-records-handling)      | Whether to enable bad records logging                        |
+| [BAD_RECORD_PATH](#bad-records-handling)                | Bad records logging path. Useful when bad record logging is enabled |
+| [BAD_RECORDS_ACTION](#bad-records-handling)             | Behavior of data loading when a bad record is found          |
+| [IS_EMPTY_DATA_BAD_RECORD](#bad-records-handling)       | Whether empty data in a column is treated as a bad record    |
+| [GLOBAL_SORT_PARTITIONS](#global_sort_partitions)       | Number of partitions to use for shuffling data during sorting |
 
   You can use the following options to load data:
 
@@ -443,14 +467,3 @@ CarbonData DML statements are documented here,which includes:
   CLEAN FILES FOR TABLE carbon_table
   ```
 
-<script>
-$(function() {
-  // Show selected style on nav item
-  $('.b-nav__docs').addClass('selected');
-
-  // Display docs subnav items
-  if (!$('.b-nav__docs').parent().hasClass('nav__item__with__subs--expanded')) {
-    $('.b-nav__docs').parent().toggleClass('nav__item__with__subs--expanded');
-  }
-});
-</script>
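
As a hedged illustration of how the load options tabulated in the dml-of-carbondata.md change above fit into a statement, the sketch below assembles a `LOAD DATA ... OPTIONS(...)` string in Python. The helper name `build_load_statement`, the input path and the table name are hypothetical; only the option names come from the table, and the quoting rule (backslash escaping inside single-quoted literals) is an assumption rather than something the diff states.

```python
def build_load_statement(path, table, options):
    """Return a LOAD DATA statement string with quoted OPTIONS pairs.

    Hypothetical helper for illustration only; it is not part of CarbonData.
    """
    def quote(text):
        # Assumption: backslash-style escaping inside single-quoted literals.
        return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"

    # Option names are upper-cased to match the spelling used in the table.
    pairs = ", ".join(f"{quote(k.upper())}={quote(v)}" for k, v in options.items())
    statement = f"LOAD DATA INPATH {quote(path)} INTO TABLE {table}"
    if pairs:
        statement += f" OPTIONS({pairs})"
    return statement


stmt = build_load_statement(
    "/tmp/sample.csv",   # hypothetical input path
    "carbon_table",      # hypothetical table name
    {"DELIMITER": ",", "QUOTECHAR": '"', "HEADER": "true"},
)
print(stmt)
```

Keeping the options in a dictionary makes it easy to see which properties from the table a given load actually sets; properties that are left out fall back to whatever defaults the loader applies.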

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/a51dc596/src/site/markdown/documentation.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/documentation.md b/src/site/markdown/documentation.md
index 66493bd..537a9d3 100644
--- a/src/site/markdown/documentation.md
+++ b/src/site/markdown/documentation.md
@@ -25,7 +25,7 @@ Apache CarbonData is a new big data file format for faster interactive query usi
 
 ## Getting Started
 
-**File Format Concepts:** Start with the basics of understanding the [CarbonData file format](./file-structure-of-carbondata.md#carbondata-file-structure) and its storage structure.This will help to understand other parts of the documentation, incuding deployment, programming and usage guides. 
+**File Format Concepts:** Start with the basics of the [CarbonData file format](./file-structure-of-carbondata.md#carbondata-file-format) and its [storage structure](./file-structure-of-carbondata.md). This will help you understand other parts of the documentation, including the deployment, programming and usage guides.
 
 **Quick Start:** [Run an example program](./quick-start-guide.md#installing-and-configuring-carbondata-to-run-locally-with-spark-shell) on your local machine or [study some examples](https://github.com/apache/carbondata/tree/master/examples/spark2/src/main/scala/org/apache/carbondata/examples).
 
@@ -35,9 +35,9 @@ Apache CarbonData is a new big data file format for faster interactive query usi
 
 
 
-## Deployment
+## Integration
 
-CarbonData can be integrated with popular Execution engines like [Spark](./quick-start-guide.md#spark) and [Presto](./quick-start-guide.md#presto).Refer to the [Installation and Configuration](./quick-start-guide.md##deployment-modes) section to understand all modes of Integrating CarbonData.
+CarbonData can be integrated with popular execution engines like [Spark](./quick-start-guide.md#spark) and [Presto](./quick-start-guide.md#presto). Refer to the [Installation and Configuration](./quick-start-guide.md#integration) section to understand all modes of integrating CarbonData.
 
 
 
@@ -64,7 +64,3 @@ faster data format.Contributing to CarbonData doesn’t just mean writing code.
 
 **Trainings:** Training records on design and code flows can be found [here](https://cwiki.apache.org/confluence/display/CARBONDATA/CarbonData+Training+Materials).
 
-<script>
-// Show selected style on nav item
-$(function() { $('.b-nav__intro').addClass('selected'); });
-</script>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/a51dc596/src/site/markdown/faq.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/faq.md b/src/site/markdown/faq.md
index fdc2ca6..8ec7290 100644
--- a/src/site/markdown/faq.md
+++ b/src/site/markdown/faq.md
@@ -26,18 +26,18 @@
 * [How to resolve Abstract Method Error?](#how-to-resolve-abstract-method-error)
 * [How Carbon will behave when execute insert operation in abnormal scenarios?](#how-carbon-will-behave-when-execute-insert-operation-in-abnormal-scenarios)
 * [Why aggregate query is not fetching data from aggregate table?](#why-aggregate-query-is-not-fetching-data-from-aggregate-table)
-* [Why all executors are showing success in Spark UI even after Dataload command failed at Driver side?](#Why-all-executors-are-showing-success-in-Spark-UI-even-after-Dataload-command-failed-at-driver-side)
-* [Why different time zone result for select query output when query SDK writer output?](#Why-different-time-zone-result-for-select-query-output-when-query-SDK-writer-output)
+* [Why all executors are showing success in Spark UI even after Dataload command failed at Driver side?](#why-all-executors-are-showing-success-in-spark-ui-even-after-dataload-command-failed-at-driver-side)
+* [Why different time zone result for select query output when query SDK writer output?](#why-different-time-zone-result-for-select-query-output-when-query-sdk-writer-output)
 
 # TroubleShooting
 
-- [Getting tablestatus.lock issues When loading data](#Getting-tablestatus.lock-issues-When-loading-data)
+- [Getting tablestatus.lock issues When loading data](#getting-tablestatuslock-issues-when-loading-data)
 - [Failed to load thrift libraries](#failed-to-load-thrift-libraries)
 - [Failed to launch the Spark Shell](#failed-to-launch-the-spark-shell)
 - [Failed to execute load query on cluster](#failed-to-execute-load-query-on-cluster)
 - [Failed to execute insert query on cluster](#failed-to-execute-insert-query-on-cluster)
 - [Failed to connect to hiveuser with thrift](#failed-to-connect-to-hiveuser-with-thrift)
-- [Failed to read the metastore db during table](#failed-to-read-the-metastore-db-during-table)
+- [Failed to read the metastore db during table creation](#failed-to-read-the-metastore-db-during-table-creation)
 - [Failed to load data on the cluster](#failed-to-load-data-on-the-cluster)
 - [Failed to insert data on the cluster](#failed-to-insert-data-on-the-cluster)
 - [Failed to execute Concurrent Operations(Load,Insert,Update) on table by multiple workers](#failed-to-execute-concurrent-operations-on-table-by-multiple-workers)
@@ -98,7 +98,7 @@ The property carbon.lock.type configuration specifies the type of lock to be acq
 In order to build CarbonData project it is necessary to specify the spark profile. The spark profile sets the Spark Version. You need to specify the ``spark version`` while using Maven to build project.
 
 ## How Carbon will behave when execute insert operation in abnormal scenarios?
-Carbon support insert operation, you can refer to the syntax mentioned in [DML Operations on CarbonData](dml-operation-on-carbondata.md).
+Carbon supports the insert operation; refer to the syntax described in [DML Operations on CarbonData](./dml-of-carbondata.md).
 First, create a source table in spark-sql and load data into this created table.
 
 ```
@@ -126,7 +126,7 @@ CREATE TABLE IF NOT EXISTS carbon_table(
 id String,
 city String,
 name String)
-STORED BY 'carbondata';
+STORED AS carbondata;
 ```
 
 ```
@@ -170,7 +170,7 @@ When SubQuery predicate is present in the query.
 Example:
 
 ```
-create table gdp21(cntry smallint, gdp double, y_year date) stored by 'carbondata';
+create table gdp21(cntry smallint, gdp double, y_year date) stored as carbondata;
 create datamap ag1 on table gdp21 using 'preaggregate' as select cntry, sum(gdp) from gdp21 group by cntry;
 select ctry from pop1 where ctry in (select cntry from gdp21 group by cntry);
 ```
@@ -181,7 +181,7 @@ When aggregate function along with 'in' filter.
 Example:
 
 ```
-create table gdp21(cntry smallint, gdp double, y_year date) stored by 'carbondata';
+create table gdp21(cntry smallint, gdp double, y_year date) stored as carbondata;
 create datamap ag1 on table gdp21 using 'preaggregate' as select cntry, sum(gdp) from gdp21 group by cntry;
 select cntry, sum(gdp) from gdp21 where cntry in (select ctry from pop1) group by cntry;
 ```
@@ -192,7 +192,7 @@ When aggregate function having 'join' with equal filter.
 Example:
 
 ```
-create table gdp21(cntry smallint, gdp double, y_year date) stored by 'carbondata';
+create table gdp21(cntry smallint, gdp double, y_year date) stored as carbondata;
 create datamap ag1 on table gdp21 using 'preaggregate' as select cntry, sum(gdp) from gdp21 group by cntry;
 select cntry,sum(gdp) from gdp21,pop1 where cntry=ctry group by cntry;
 ```
@@ -462,10 +462,3 @@ Note :  Refrain from using "mvn clean package" without specifying the profile.
   A single column that can be considered as dimension is mandatory for table creation.
 
 
-
-
-<script>
-// Show selected style on nav item
-$(function() { $('.b-nav__faq').addClass('selected'); });
-</script>
-
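
The substitution this commit applies across the documentation, `STORED BY 'carbondata'` to `STORED AS carbondata`, could be applied to other SQL snippets with a small script. The following is a minimal sketch in Python, not the tool the authors used; the function name `migrate_stored_clause` is hypothetical, and the pattern only covers the single-quoted form that appears in these diffs.

```python
import re

# Matches STORED BY 'carbondata' in any letter case, tolerating extra spaces.
_STORED_BY = re.compile(r"STORED\s+BY\s+'carbondata'", re.IGNORECASE)


def migrate_stored_clause(sql):
    """Rewrite the old STORED BY clause to the STORED AS form (illustrative only)."""
    def swap(match):
        # Follow the letter case of the matched keyword, as the diff does
        # for the lower-case 'stored by' examples in faq.md.
        if match.group(0)[0].islower():
            return "stored as carbondata"
        return "STORED AS carbondata"

    return _STORED_BY.sub(swap, sql)


old = "create table gdp21(cntry smallint, gdp double, y_year date) stored by 'carbondata';"
print(migrate_stored_clause(old))
```

Statements that do not contain the old clause pass through unchanged, so the function can be run over a whole file of mixed SQL.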

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/a51dc596/src/site/markdown/file-structure-of-carbondata.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/file-structure-of-carbondata.md b/src/site/markdown/file-structure-of-carbondata.md
index 80c91d5..2b43105 100644
--- a/src/site/markdown/file-structure-of-carbondata.md
+++ b/src/site/markdown/file-structure-of-carbondata.md
@@ -6,48 +6,173 @@
     (the "License"); you may not use this file except in compliance with 
     the License.  You may obtain a copy of the License at
 
-      http://www.apache.org/licenses/LICENSE-2.0
+```
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software 
+distributed under the License is distributed on an "AS IS" BASIS, 
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and 
+limitations under the License.
+```
 
-    Unless required by applicable law or agreed to in writing, software 
-    distributed under the License is distributed on an "AS IS" BASIS, 
-    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-    See the License for the specific language governing permissions and 
-    limitations under the License.
 -->
 
-# CarbonData File Structure
+# CarbonData table structure
 
 CarbonData files contain groups of data called blocklets, along with all required information like schema, offsets and indices etc, in a file header and footer, co-located in HDFS.
 
 The file footer can be read once to build the indices in memory, which can be utilized for optimizing the scans and processing for all subsequent queries.
 
-### Understanding CarbonData File Structure
-* Block : It would be as same as HDFS block, CarbonData creates one file for each data block, user can specify TABLE_BLOCKSIZE during creation table. Each file contains File Header, Blocklets and File Footer.
+This document describes what a CarbonData table looks like in an HDFS directory, which files are written, and the content of each file.
+
+- [File Directory Structure](#file-directory-structure)
+
+- [File Content details](#file-content-details)
+  - [Schema file format](#schema-file-format)
+  - [CarbonData file format](#carbondata-file-format)
+    - [Blocklet format](#blocklet-format)
+      - [V1](#v1)
+      - [V2](#v2)
+      - [V3](#v3)
+    - [Footer format](#footer-format)
+  - [carbonindex file format](#carbonindex-file-format)
+  - [Dictionary file format](#dictionary-file-format)
+  - [tablestatus file format](#tablestatus-file-format)
+
+## File Directory Structure
+
+The CarbonData files are stored in the location specified by the ***carbon.storelocation*** configuration (configured in carbon.properties; if not configured, the default is ../carbon.store).
+
+  The file directory structure is as below: 
+
+![File Directory Structure](../../src/site/images/2-1_1.png)
+
+1. ModifiedTime.mdt records the timestamp of the metadata through the modification time attribute of the file. When drop table or create table is executed, the modification time of the file is updated. This file is common to all databases and hence is kept alongside the database directories.
+2. **default** is the database name and contains the user tables. It is used when the user doesn't specify a database name; otherwise the user-configured database name is the directory name. user_table is the table name.
+3. The Metadata directory stores schema files, the tablestatus file and dictionary files (including .dict, .dictmeta and .sortindex). These are the three types of metadata files.
+4. Data and index files are stored under the directory named **Fact**. The Fact directory has a Part0 partition directory, where 0 is the partition number.
+5. There is a Segment_0 directory under the Part0 directory, where 0 is the segment number.
+6. There are two types of files, carbondata and carbonindex, in the Segment_0 directory.
+
+
+
+## File Content details
+
+When the table is created, the user_table directory is generated, and a schema file is generated in the Metadata directory for recording the table structure.
+
+When loading data in batches, each batch load generates a new segment directory. The scheduler tries to run one data loading task on each node. Each task generates multiple carbondata files and one carbonindex file.
+
+During global dictionary generation, if the two-pass scheme is used, the corresponding dict, dictmeta and sortindex files are generated for each dictionary-encoded column before the data is loaded. Partial dictionary files can be provided through the pre-defined dictionary method to reduce the need to generate dictionary values by scanning the full data; a dictionary file for all dictionary-encoded columns can also be provided through the all-dictionary method to avoid scanning the data. If the single-pass scheme is adopted, the global dictionary codes are generated in real time during data loading, and after the data is loaded the dictionary is persisted into a dictionary file.
+
+The following sections use the Java objects generated from the thrift file that describes the carbondata file format to explain the contents of each file one by one (you can also directly read the format defined in the [thrift file](https://github.com/apache/carbondata/tree/master/format/src/main/thrift)).
+
+### Schema file format
+
+The contents of the schema file are shown below.
+
+![Schema file format](../../src/site/images/2-2_1.png)
+
+1. TableSchema class
+    The TableSchema class does not store the table name; it is inferred from the directory name (user_table).
+    tableProperties is used to record table-related properties, such as: table_blocksize.
+2. ColumnSchema class
+    Encoders are used to record the encoding used in column storage.
+    columnProperties is used to record column related properties.
+3. BucketingInfo class
+    When creating a bucketed table, you can specify the number of buckets in the table and the column used to split the buckets.
+4. DataType class
+    Describes the data types supported by CarbonData.
+5. Encoding class
+    Several encodings that may be used in CarbonData files.
+
+### CarbonData file format
+
+#### File Header
+
+It contains the CarbonData file version number, the list of column schemas and the schema update timestamp.
+
+![File Header](../../src/site/images/carbon_data_file_structure_new.png)
+
+The carbondata file consists of multiple blocklets and a footer part. The blocklet is the dataset inside the carbondata file (in the latest V3 format, the default size is 64MB). Each blocklet contains a ColumnChunk for each column, and a ColumnChunk may contain one or more Column Pages.
+
+The carbondata file currently supports the V1, V2 and V3 versions. The main difference between them is the blocklet part, which is described for each version below.
+
+#### Blocklet format
+
+#####  V1
+
+The blocklet consists of all column data pages, RLE pages, and rowID pages. Since the pages in the blocklet are grouped according to the page type, the three pieces of data of each column are stored apart from each other in the blocklet, and the offset and length information of all the pages needs to be recorded in the footer part.
+
+![V1](../../src/site/images/2-3_1.png)
+
+##### V2
+
+The blocklet consists of the ColumnChunks of all columns. The ColumnChunk for a column consists of a ColumnPage, which includes the data chunk header, data page, RLE page, and rowID page. Since the ColumnChunk aggregates the three types of page data of a column together, the column data can be read with fewer read operations. Since the header part records the length information of all the pages, the footer part only needs to record the offset and length of the ColumnChunk, which also reduces the amount of footer data.
+
+![V2](../../src/site/images/2-3_2.png)
+
+##### V3
+
+The blocklet is also composed of ColumnChunks of all columns. What is changed is that a ColumnChunk consists of one or more Column Pages, and Column Page adds a new BlockletMinMaxIndex.
+
+Compared with V2: the blocklet data volume of the V2 format defaults to 120,000 rows, while that of the V3 format defaults to 64MB. For a data file of the same size, the index metadata in the footer part may be further reduced. The V3 format also adds page-level data filtering, and the amount of data per page is only 32,000 rows by default, far fewer than the 120,000 rows of the V2 format. Filtering is therefore more precise, and more data can be filtered out before the data is decompressed.
+
+![V3](../../src/site/images/2-3_3.png)
+
+#### Footer format
+
+The footer records the data distribution information of all blocklets and the statistics-related metadata (minmax, startkey/endkey) inside each carbondata file.
+
+![Footer format](../../src/site/images/2-3_4.png)
+
+1.  BlockletInfo3 is used to record the offset and length of all ColumnChunk3.
+2.  SegmentInfo is used to record the number of columns and the cardinality of each column.
+3.  BlockletIndex includes BlockletMinMaxIndex and BlockletBTreeIndex.
+
+BlockletBTreeIndex is used to record the startkey/endkey of all blocklets in the block. When querying, the startkey/endkey of the query is generated from the filter conditions combined with mdkey. With BlockletBTreeIndex, the range of blocklets satisfying the conditions in each block can be delineated.
+
+BlockletMinMaxIndex is used to record the min/max value of all columns in the blocklet. By using the min/max check on the filter condition, you can skip the block/blocklet that does not satisfy the condition.
+
+### carbonindex file format
+
+The carbonindex file is generated by extracting the BlockletIndex part of the footer. When data is loaded in batches, the scheduler tries to start one task per node; each task generates multiple carbondata files and one carbonindex file. The carbonindex file records the index information of all the blocklets in all the carbondata files generated by the task.
+
+As shown in the figure, the index information corresponding to a block is recorded by a BlockIndex object, including the carbondata filename, footer offset and BlockletIndex. The BlockIndex data volume is smaller than that of the footer. When querying, the file is used directly to build the index on the driver side, without having to read the footers of multiple data files.
+
+![carbonindex file format](../../src/site/images/2-4_1.png)
+
+### Dictionary file format
+
+
+For each dictionary encoded column, a dictionary file is used to store the dictionary metadata for that column.
+
+1. dict file records the distinct value list of a column
+
+On the first data load, the file is generated from the distinct value list of the column. The values in the file are unordered, and subsequent loads append to it. In the second step of data loading (Data Convert Step), the actual values of dictionary-encoded columns are replaced with dictionary keys.
+
+![Dictionary file format](../../src/site/images/2-5_1.png)
+
+
+2.  dictmeta records the metadata description of the new distinct values added by each data load
+
+The dictionary cache uses this information to incrementally flush the cache.
+
+![Dictionary Chunk](../../src/site/images/2-5_2.png)
+	
+
+3.  sortindex records the dictionary keys sorted by their values.
 
-![CarbonData File Structure](../../src/site/images/carbon_data_file_structure_new.png)
+During data loading, if there is a new dictionary value, the sortindex file is regenerated using all the dictionary codes.
 
-* File Header : It contains CarbonData file version number, list of column schema and schema updation timestamp.
-* File Footer : it contains Number of rows, segmentinfo ,all blocklets’ info and index, you can find the detail from the below diagram.
-* Blocklet : Rows are grouped to form a blocklet, the size of the blocklet is configurable and default size is 64MB, Blocklet contains Column Page groups for each column.
-* Column Page Group : Data of one column and it is further divided into pages, it is guaranteed to be contiguous in file.
-* Page : It has the data of one column and the number of row is fixed to 32000 size.
+Filter queries on dictionary-encoded columns need to convert the value filter condition to a key filter condition. Using the sortindex file, you can quickly construct an ordered value sequence to find the key corresponding to a value, thus speeding up the conversion process.
 
-![CarbonData File Format](../../src/site/images/carbon_data_format_new.png)
+![sortindex file format](../../src/site/images/2-5_3.png)
 
-### Each page contains three types of data
-* Data Page: Contains the encoded data of a column of columns.
-* Row ID Page (optional): Contains the row ID mappings used when the data page is stored as an inverted index.
-* RLE Page (optional): Contains additional metadata used when the data page is RLE coded.
+### tablestatus file format
 
+Tablestatus records the segment-related information (in gson format) for each load and merge, including load time, load status, segment name, whether it was deleted, and the name of the segment it was merged into. The tablestatus file is regenerated after each load or merge.
 
-<script>
-$(function() {
-  // Show selected style on nav item
-  $('.b-nav__docs').addClass('selected');
+![tablestatus file format](../../src/site/images/2-6_1.png)
 
-  // Display docs subnav items
-  if (!$('.b-nav__docs').parent().hasClass('nav__item__with__subs--expanded')) {
-    $('.b-nav__docs').parent().toggleClass('nav__item__with__subs--expanded');
-  }
-});
-</script>
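
The min/max check described for BlockletMinMaxIndex in the file-structure-of-carbondata.md change above can be illustrated with a few lines of Python. This is a sketch of the general pruning idea under simplifying assumptions (a single integer column and an equality filter), not CarbonData's implementation; the `BlockletStats` class, its fields and the sample ranges are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BlockletStats:
    """Hypothetical stand-in for one blocklet's min/max entry for one column."""
    name: str
    min_value: int
    max_value: int


def prune_for_equality(blocklets, target):
    """Keep only blocklets whose [min, max] range could contain target."""
    return [b for b in blocklets if b.min_value <= target <= b.max_value]


blocklets = [
    BlockletStats("blocklet_0", 1, 100),
    BlockletStats("blocklet_1", 101, 250),
    BlockletStats("blocklet_2", 90, 120),
]

# A filter such as "col = 95" only needs to scan blocklets 0 and 2.
survivors = prune_for_equality(blocklets, 95)
print([b.name for b in survivors])  # ['blocklet_0', 'blocklet_2']
```

A blocklet that survives the check may still contain no matching row; the index only guarantees that skipped blocklets cannot match, which is why the smaller per-page ranges of the V3 format filter more precisely.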

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/a51dc596/src/site/markdown/how-to-contribute-to-apache-carbondata.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/how-to-contribute-to-apache-carbondata.md b/src/site/markdown/how-to-contribute-to-apache-carbondata.md
index 741e6d6..f64c948 100644
--- a/src/site/markdown/how-to-contribute-to-apache-carbondata.md
+++ b/src/site/markdown/how-to-contribute-to-apache-carbondata.md
@@ -189,11 +189,4 @@ From another local branch, run:
 $ git fetch --all
 $ git branch -d <my-branch>
 $ git push <GitHub_user> --delete <my-branch>
-```
-
-
-<script>
-// Show selected style on nav item
-$(function() { $('.b-nav__contri').addClass('selected'); });
-</script>
-
+```
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/a51dc596/src/site/markdown/introduction.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/introduction.md b/src/site/markdown/introduction.md
index 8169958..434ccfa 100644
--- a/src/site/markdown/introduction.md
+++ b/src/site/markdown/introduction.md
@@ -16,157 +16,102 @@ CarbonData has
 
 - **Multi level indexing** to efficiently prune the files and data to be scanned and hence reduce I/O scans and CPU processing
 
+## CarbonData Features & Functions
 
+CarbonData has a rich set of features to support various use cases in Big Data analytics. The major features supported by CarbonData are listed below.
 
-## Architecture
 
-![](/Users/aditi_advith/Documents/code/carbondata/docs/images/carbondata_architecture.png)
 
+### Table Management
 
+- ##### DDL (Create, Alter, Drop, CTAS)
 
-#### Spark Interface Layer: 
+  CarbonData provides its own DDL to create and manage carbondata tables. These DDL statements conform to the Hive and Spark SQL formats and support additional properties and configuration to take advantage of CarbonData functionalities.
 
-CarbonData has deep integration with Apache Spark.CarbonData integrates custom Parser,Strategies,Optimization rules into Spark to take advantage of computing performed closer to data.
+- ##### DML (Load, Insert)
 
-![](/Users/aditi_advith/Documents/code/carbondata/docs/images/carbondata_spark_integration.png)
+  CarbonData provides its own DML to manage data in carbondata tables. It offers many configuration options to customize the behavior to suit user requirements.
 
-1. **Carbon parser** Enhances Spark’s SQL parser to support Carbon specific DDL and DML commands to create carbon table, create aggregate tables, manage data loading, data retention and cleanup.
-2. **Carbon Strategies**:- Modify Spark SQL’s physical query execution plan to push down possible operations to Carbon for example:- Grouping, Distinct Count, Top N etc.. for improving query performance.
-3. **Carbon Data RDD**:- Makes the data present in Carbon tables visible to Spark as a RDD which enables spark to perform distributed computation on Carbon tables.
+- ##### Update and Delete
 
+  CarbonData supports Update and Delete on Big Data. CarbonData provides syntax similar to Hive to support IUD operations on CarbonData tables.
 
+- ##### Segment Management
 
-#### Carbon Processor: 
+  CarbonData has a unique concept of segments to manage incremental loads to CarbonData tables effectively. Segment management helps to control the table easily, simplifies retention, and is also used to provide transaction capability for the operations being performed.
 
-Receives a query execution fragment from spark and executes the same on the Carbon storage. This involves Scanning the carbon store files for matching record, using the indices to directly locate the row sets and even the rows that may containing the data being searched for. The Carbon processor also performs all pushed down operations such as 
+- ##### Partition
 
-Aggregation/Group By
+  CarbonData supports 2 kinds of partitions: (1) partitions similar to Hive partitions, and (2) CarbonData partitions supporting hash, list, and range partitioning.
 
-Distinct Count
+- ##### Compaction
 
-Top N
+  CarbonData manages incremental loads as segments. Compaction helps to compact the growing number of segments and also to improve query filter pruning.
 
-Expression Evaluation
+- ##### External Tables
 
-And many more…
+  CarbonData can read any carbondata file, automatically infer the schema from the file, and provide a relational table view to perform SQL queries using Spark or any other application.
 
-#### Carbon Storage:
+### DataMaps
 
-Custom columnar data store which is heavily compressed, binary, dictionary encoded and heavily indexed.Usaually stored in HDFS.
+- ##### Pre-Aggregate
 
-## CarbonData Features
+  CarbonData has the concept of datamaps to assist in pruning data while querying so that queries run faster. Pre-aggregate tables are a kind of datamap that can improve query performance by an order of magnitude. CarbonData will automatically pre-aggregate the incremental data and rewrite the query to fetch from the most appropriate pre-aggregate table to serve the query faster.
 
-CarbonData has rich set of featues to support various use cases in Big Data analytics.
+- ##### Time Series
 
- 
+  CarbonData has a built-in understanding of time order (year, month, day, hour, minute, second). Time series is a pre-aggregate table which can automatically roll up the data to the desired level during incremental load and serve the query from the most appropriate pre-aggregate table.
 
-## Design
+- ##### Bloom filter
 
-- ### Dictionary Encoding
+  CarbonData supports bloom filter as a datamap in order to quickly and efficiently prune the data for scanning and achieve faster query performance.
 
-CarbonData supports encoding of data with suggogate values to reduce storage space and speed up processing.Most databases and big data SQL data stores adopt dictionary encoding(integer surrogate numbers) to achieve data compression.Unlike other column store databases where the dictionary is local to each data block, CarbonData maintains a global dictionary which provides opportunity for lazy conversion to actual values enabling all computation to be performed on the lightweight surrogate values.
+- ##### Lucene
 
-##### Dictionary generation
+  Lucene is popular for indexing long text data. CarbonData provides a lucene datamap so that text columns can be indexed using lucene and the index result can be used for efficient pruning of the data to be retrieved during query.
 
-![](/Users/aditi_advith/Documents/code/carbondata/docs/images/carbondata_dict_encoding.png)
+- ##### MV (Materialized Views)
 
+  MVs are a kind of pre-aggregate table which can support efficient query rewrite and processing. CarbonData provides MV which can rewrite a query to fetch from any table (including non-carbondata tables). A typical use case is to store the aggregated data of a non-carbondata fact table in carbondata and use MV to rewrite the query to fetch from carbondata.
 
+### Streaming
 
-##### MDK Indexing
+- ##### Spark Streaming
 
-All the surrogate keys are byte packed to generate an MDK (Multi Dimensional Key) Index.
+  CarbonData supports streaming of data into carbondata in near-realtime and makes it immediately available for query. CarbonData provides a DSL to create source and sink tables easily without the need for users to write their own application.
 
-Any non surrogate columns of String data types are compressed using one of the configured compression algorithms and stored.For those numeric columns where surrogates are not generated, such data is stored as it is after compression.
+### SDK
 
-![image-20180903212418381](/Users/aditi_advith/Documents/code/carbondata/docs/images/carbondata_mdk.png)
+- ##### CarbonData writer
 
-##### Sorted MDK
+  CarbonData supports writing data from non-Spark applications using the SDK. Users can use the SDK to generate carbondata files from custom applications. A typical use case is to write a streaming application plugged in to Kafka and use carbondata as the sink (target) table for storing.
 
-The data is sorted based on the MDK Index.Sorting helps for logical grouping of similar data and there by aids in faster look up during query.
+- ##### CarbonData reader
 
-#### ![image-20180903212525214](/Users/aditi_advith/Documents/code/carbondata/docs/images/carbondata_mdk_sort.png)
+  CarbonData supports reading data from non-Spark applications using the SDK. Users can use the SDK to read carbondata files from their application and do custom processing.
 
-##### Custom Columnar Encoding
+### Storage
 
-The Sorted MDK Index is split into each column.Unlike other stores where the column is compressed and stored as it is, CarbonData sorts this column data so that Binary Search can be performed on individual column data based on the filter conditions.This aids in magnitude increase in query performance and also in better compression.Since the individual column's data gets sorted, it is necessary to maintain the row mapping with the sorted MDK Index data in order to retrieve data from other columns which are not participating in filter.This row mapping is termed as **Inverted Index** and is stored along with the column data.The below picture depicts the logical column view.User has the option to **turn off** Inverted Index for such columns where filters are never applied or is very rare.In such cases, scanning would be sequential, but can aid in reducing the storage size(occupied due to inverted index data).
+- ##### S3
 
-#### ![](/Users/aditi_advith/Documents/code/carbondata/docs/images/carbondata_blocklet_view.png)
+  CarbonData can write to S3, OBS or any cloud storage conforming to the S3 protocol. CarbonData uses the HDFS API to write to cloud object stores.
 
-- ### CarbonData Storage Format
+- ##### HDFS
 
-  CarbonData has a unique storage structure which aids in efficient storage and retrieval of data.Please refer to [File Structure of CarbonData](#./file-structure-of-carbondata.md) for detailed information on the format.
-
-- ### Indexing
-
-  CarbonData maintains multiple indexes at multiple levels to assist in efficient pruning of unwanted data from scan during query.Also CarbonData has support for plugging in external indexing solutions to speed up the query process.
-
-  ##### Min-Max Indexing
-
-  Storing data along with index significantly accelerates query performance and reduces the I/O scans and CPU resources in case of filters in the query. CarbonData index consists of multiple levels of indices, a processing framework can leverage this index to reduce the number of tasks it needs to schedule and process. It can also do skip scan in more fine grained units (called blocklet) in task side scanning instead of scanning the whole file.  **CarbonData maintains Min-Max Index for all the columns.**
-
-  CarbonData maintains a separate index file which contains the footer information for efficient IO reads.
-
-  Using the Min-Max info in these index files, two levels of filtering can be achieved.
-
-  Min-Max at the carbondata file level,to efficiently prune the files when the filter condition doesn't fall in the range.This information when maintained at the Spark Driver, will help to efficiently schedule the tasks for scanning
-
-  Min-Max at the blocklet level, to efficiently prune the blocklets when the filter condition doesn't fall in the range.This information when maintained at the executor can significantly reduce the amount unnecessary data processed by the executor tasks. 
-
-
-
-  ![](/Users/aditi_advith/Documents/code/carbondata/docs/images/carbondata-minmax-blocklet.png)
-
-- #### DataMaps
-
-  DataMap is a framework for indexing and also for statistics that can be used to add primary index (Blocklet Index) , secondary index type and statistical type to CarbonData.
-
-  DataMap is a standardized general interface which CarbonData uses to prune data blocks for scanning.
-
-  DataMaps are of 2 types:
-
-  **CG(Coarse Grained) DataMaps** Can prune data to the blocklet or to Page level.ie., Holds information for deciding which blocks/blocklets to be scanned.This DataMap is used in Spark Driver to decide the number of tasks to be scheduled.
-
-  **FG(Fine Grained) DataMaps** Can prune data to row level.This DataMap is used in Spark executor for scanning an fetching the data much faster.
-
-  Since DataMap interfaces are generalised, We can write a thin adaptor called as **DataMap Providers** to interface between CarbonData and other external Indexing engines. For eg., Lucene, Solr,ES,...
-
-  CarbonData has its own DSL to create and manage DataMaps.Please refer to [CarbonData DSL](#./datamap/datamap-management.md#overview) for more information.
-
-  The below diagram explains about the DataMap execution in CarbonData.
-
-  ![](/Users/aditi_advith/Documents/code/carbondata/docs/images/carbondata-datamap.png)
-
-- #### Update & Delete
-
-
-CarbonData supports Update and delete operations over big data.This functionality is not targetted for OLTP scenarios where high concurrent update/delete are required.Following are the assumptions considered when this feature is designed.
-
-1. Updates or Deletes are periodic and in Bulk
-2. Updates or Deletes are atomic
-3. Data is immediately visible
-4. Concurrent query to be allowed during an update or delete operation
-5. Single statement auto-commit support (not OLTP-style transaction)
-
-Since data stored in HDFS are immutable,data blocks cannot be updated in-place.Re-write of entire data block is not efficient for IO and also is a slow process.
-
-To over come these limitations, CarbonData adopts methodology of writing a delta file containing the rows to be deleted and another delta file containing the values to be updated with.During processing, These two delta files are merged with the main carbondata file and the correct result is returned for the query.
-
-The below diagram describes the process.
-
-![](/Users/aditi_advith/Documents/code/carbondata/docs/images/carbondata_update_delete.png)
+  CarbonData uses the HDFS API to write and read data from HDFS. CarbonData can take advantage of the locality information to suggest that Spark run tasks close to the data.
 
 
 
 ## Integration with Big Data ecosystem
 
-Refer to Integration with [Spark](#./quick-start-guide.md#spark), [Presto](#./quick-start-guide.md#presto) for detailed information on integrating CarbonData with these execution engines.
+Refer to Integration with [Spark](./quick-start-guide.md#spark), [Presto](./quick-start-guide.md#presto) for detailed information on integrating CarbonData with these execution engines.
 
 ## Scenarios where CarbonData is suitable
 
+CarbonData is useful in various analytical workloads. Some of the most typical use cases where CarbonData is being used are [documented here](./usecases.md).
+
 
 
+## Performance Results
 
-<script>
-// Show selected style on nav item
-$(function() { $('.b-nav__intro').addClass('selected'); });
-</script>
\ No newline at end of file
+![Performance Results](../docs/images/carbondata-performance.png?raw=true)

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/a51dc596/src/site/markdown/language-manual.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/language-manual.md b/src/site/markdown/language-manual.md
index 9fef71b..123cae3 100644
--- a/src/site/markdown/language-manual.md
+++ b/src/site/markdown/language-manual.md
@@ -24,12 +24,11 @@ CarbonData has its own parser, in addition to Spark's SQL Parser, to parse and p
 - [Data Types](./supported-data-types-in-carbondata.md)
 - Data Definition Statements
   - [DDL:](./ddl-of-carbondata.md)[Create](./ddl-of-carbondata.md#create-table),[Drop](./ddl-of-carbondata.md#drop-table),[Partition](./ddl-of-carbondata.md#partition),[Bucketing](./ddl-of-carbondata.md#bucketing),[Alter](./ddl-of-carbondata.md#alter-table),[CTAS](./ddl-of-carbondata.md#create-table-as-select),[External Table](./ddl-of-carbondata.md#create-external-table)
-  - Indexes
-  - [DataMaps](./datamap-management.md)
-    - [Bloom](./bloomfilter-datamap-guide.md)
-    - [Lucene](./lucene-datamap-guide.md)
-    - [Pre-Aggregate](./preaggregate-datamap-guide.md)
-    - [Time Series](./timeseries-datamap-guide.md)
+  - [DataMaps](./datamap/datamap-management.md)
+    - [Bloom](./datamap/bloomfilter-datamap-guide.md)
+    - [Lucene](./datamap/lucene-datamap-guide.md)
+    - [Pre-Aggregate](./datamap/preaggregate-datamap-guide.md)
+    - [Time Series](./datamap/timeseries-datamap-guide.md)
   - Materialized Views (MV)
   - [Streaming](./streaming-guide.md)
 - Data Manipulation Statements
@@ -37,15 +36,4 @@ CarbonData has its own parser, in addition to Spark's SQL Parser, to parse and p
   - [Segment Management](./segment-management-on-carbondata.md)
 - [Configuration Properties](./configuration-parameters.md)
 
-<script>
-$(function() {
-  // Show selected style on nav item
-  $('.b-nav__docs').addClass('selected');
-
-  // Display docs subnav items
-  if (!$('.b-nav__docs').parent().hasClass('nav__item__with__subs--expanded')) {
-    $('.b-nav__docs').parent().toggleClass('nav__item__with__subs--expanded');
-  }
-});
-</script>
 

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/a51dc596/src/site/markdown/lucene-datamap-guide.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/lucene-datamap-guide.md b/src/site/markdown/lucene-datamap-guide.md
index 248c8e5..86b00e2 100644
--- a/src/site/markdown/lucene-datamap-guide.md
+++ b/src/site/markdown/lucene-datamap-guide.md
@@ -59,7 +59,7 @@ It will show all DataMaps created on main table.
     age int,
     city string,
     country string)
-  STORED BY 'carbondata'
+  STORED AS carbondata
   ```
   
   User can create Lucene datamap using the Create DataMap DDL:
@@ -149,7 +149,7 @@ select * from datamap_test where TEXT_MATCH('name:*n*')
 
 select * from datamap_test where TEXT_MATCH('name:*10 -name:*n*')
 ```
-**Note:** For lucene queries and syntax, refer to [lucene-syntax](www.lucenetutorial.com/lucene-query-syntax.html)
+**Note:** For lucene queries and syntax, refer to [lucene-syntax](http://www.lucenetutorial.com/lucene-query-syntax.html)
 
 ## Data Management with lucene datamap
 Once there is lucene datamap is created on the main table, following command on the main
@@ -173,15 +173,4 @@ release, user can do as following:
 3. Create the lucene datamap again by `CREATE DATAMAP` command.
 Basically, user can manually trigger the operation by re-building the datamap.
 
-<script>
-$(function() {
-  // Show selected style on nav item
-  $('.b-nav__datamap').addClass('selected');
-  
-  if (!$('.b-nav__datamap').parent().hasClass('nav__item__with__subs--expanded')) {
-    // Display datamap subnav items
-    $('.b-nav__datamap').parent().toggleClass('nav__item__with__subs--expanded');
-  }
-});
-</script>
 

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/a51dc596/src/site/markdown/performance-tuning.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/performance-tuning.md b/src/site/markdown/performance-tuning.md
index d8b53f2..f56a63b 100644
--- a/src/site/markdown/performance-tuning.md
+++ b/src/site/markdown/performance-tuning.md
@@ -22,6 +22,7 @@
   * [Suggestions to create CarbonData Table](#suggestions-to-create-carbondata-table)
   * [Configuration for Optimizing Data Loading performance for Massive Data](#configuration-for-optimizing-data-loading-performance-for-massive-data)
   * [Optimizing Query Performance](#configurations-for-optimizing-carbondata-performance)
+  * [Compaction Configurations for Optimizing CarbonData Query Performance](#compaction-configurations-for-optimizing-carbondata-query-performance)
 
 ## Suggestions to Create CarbonData Table
 
@@ -56,7 +57,7 @@
     counter_1, Decimal
     ...
     
-    )STORED BY 'carbondata'
+    )STORED AS carbondata
     TBLPROPERTIES ('SORT_COLUMNS'='msisdn, Dime_1')
   ```
 
@@ -81,7 +82,7 @@
       counter_1, Decimal
       ...
       
-      )STORED BY 'carbondata'
+      )STORED AS carbondata
       TBLPROPERTIES ('SORT_COLUMNS'='Dime_1, HOST, MSISDN')
   ```
 
@@ -100,7 +101,7 @@
     counter_1 decimal,
     counter_2 double,
     ...
-    )STORED BY 'carbondata'
+    )STORED AS carbondata
     TBLPROPERTIES ('SORT_COLUMNS'='Dime_1, HOST, MSISDN')
 ```
   The result of performance analysis of test-case shows reduction in query execution time from 15 to 3 seconds, thereby improving performance by nearly 5 times.
@@ -121,12 +122,12 @@
     END_TIME bigint,
     ...
     counter_100 double
-    )STORED BY 'carbondata'
+    )STORED AS carbondata
     TBLPROPERTIES ('SORT_COLUMNS'='Dime_1, HOST, MSISDN')
   ```
 
   **NOTE:**
-  + BloomFilter can be created to enhance performance for queries with precise equal/in conditions. You can find more information about it in BloomFilter datamap [document](https://github.com/apache/carbondata/blob/master/docs/datamap/bloomfilter-datamap-guide.md).
+  + BloomFilter can be created to enhance performance for queries with precise equal/in conditions. You can find more information about it in BloomFilter datamap [document](./datamap/bloomfilter-datamap-guide.md).
 
 
 ## Configuration for Optimizing Data Loading performance for Massive Data
@@ -176,8 +177,70 @@
 
   Note: If your CarbonData instance is provided only for query, you may specify the property 'spark.speculation=true' which is in conf directory of spark.
 
+## Compaction Configurations for Optimizing CarbonData Query Performance
+
+CarbonData provides many configurations to tune the compaction behavior so that query performance is improved.
+
+
+
+Based on the number of cores available in the node, it is recommended to tune the configuration ***carbon.number.of.cores.while.compacting*** appropriately. Configuring a higher value will improve the overall compaction performance.
+
+<p>&nbsp;</p>
+<table style="width: 777px;">
+<tbody>
+<tr style="height: 23px;">
+<td style="height: 23px; width: 95.375px;">No</td>
+<td style="height: 23px; width: 299.625px;">&nbsp;Data Loading frequency</td>
+<td style="height: 23px; width: 144px;">Data Size of each load</td>
+<td style="height: 23px; width: 204px;">Minor Compaction configuration</td>
+<td style="height: 23px; width: 197px;">&nbsp;Major compaction configuration</td>
+</tr>
+<tr style="height: 29.5px;">
+<td style="height: 29.5px; width: 95.375px;">1</td>
+<td style="height: 29.5px; width: 299.625px;">&nbsp;Batch (once in several hours)</td>
+<td style="height: 29.5px; width: 144px;">Big</td>
+<td style="height: 29.5px; width: 204px;">&nbsp;Not Suggested</td>
+<td style="height: 29.5px; width: 197px;">Configure a major compaction size of 3-4 times the load size. Perform major compaction once a day.</td>
+</tr>
+<tr style="height: 23px;">
+<td style="height: 23px; width: 95.375px;" rowspan="2">2</td>
+<td style="height: 23px; width: 299.625px;" rowspan="2">&nbsp;Batch (once every few minutes)&nbsp;</td>
+<td style="height: 23px; width: 144px;">Big&nbsp;</td>
+<td style="height: 23px; width: 204px;">
+<p>&nbsp;Minor compaction (2,2).</p>
+<p>Enable auto compaction if a high data loading rate is not required or the time between loads is sufficient to run the compaction.</p>
+</td>
+<td style="height: 23px; width: 197px;">Major compaction size of 10 times the load size. Perform major compaction once a day.</td>
+</tr>
+<tr style="height: 23px;">
+<td style="height: 23px; width: 144px;">Small</td>
+<td style="height: 23px; width: 204px;">
+<p>Minor compaction (6,6).</p>
+<p>Enable auto compaction if a high data loading rate is not required or the time between loads is sufficient to run the compaction.</p>
+</td>
+<td style="height: 23px; width: 197px;">Major compaction size of 10 times the load size. Perform major compaction once a day.</td>
+</tr>
+<tr style="height: 23px;">
+<td style="height: 23px; width: 95.375px;">3</td>
+<td style="height: 23px; width: 299.625px;">&nbsp;History data loaded as a single load; incremental loads match&nbsp;(1) or (2)</td>
+<td style="height: 23px; width: 144px;">Big</td>
+<td style="height: 23px; width: 204px;">
+<p>&nbsp;Configure ALLOWED_COMPACTION_DAYS to exclude the history load.</p>
+<p>Configure minor compaction based on&nbsp;condition (1) or (2).</p>
+</td>
+<td style="height: 23px; width: 197px;">&nbsp;Configure a major compaction size smaller than the history load size.</td>
+</tr>
+<tr style="height: 23px;">
+<td style="height: 23px; width: 95.375px;">4</td>
+<td style="height: 23px; width: 299.625px;">&nbsp;Recently loaded data may contain errors and sometimes needs to be reloaded</td>
+<td style="height: 23px; width: 144px;">&nbsp;(1) or (2)</td>
+<td style="height: 23px; width: 204px;">
+<p>&nbsp;Configure COMPACTION_PRESERVE_SEGMENTS</p>
+<p>to exclude the recent few segments from compacting.</p>
+<p>Configure minor compaction based on&nbsp;condition (1) or (2).</p>
+</td>
+<td style="height: 23px; width: 197px;">Same as (1) or (2)&nbsp;</td>
+</tr>
+</tbody>
+</table>
 
-<script>
-// Show selected style on nav item
-$(function() { $('.b-nav__perf').addClass('selected'); });
-</script>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/a51dc596/src/site/markdown/preaggregate-datamap-guide.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/preaggregate-datamap-guide.md b/src/site/markdown/preaggregate-datamap-guide.md
index 9c7a5f8..3a3efc2 100644
--- a/src/site/markdown/preaggregate-datamap-guide.md
+++ b/src/site/markdown/preaggregate-datamap-guide.md
@@ -64,7 +64,7 @@ Start spark-shell in new terminal, type :paste, then copy and run the following
       | country string,
       | quantity int,
       | price bigint)
-      | STORED BY 'carbondata'
+      | STORED AS carbondata
     """.stripMargin)
  
  // Create pre-aggregate table on the main table
@@ -162,7 +162,7 @@ It will show all DataMaps created on main table.
     country string,
     quantity int,
     price bigint)
-  STORED BY 'carbondata'
+  STORED AS carbondata
   ```
   
   User can create pre-aggregate tables using the Create DataMap DDL
@@ -270,15 +270,3 @@ release, user can do as following:
 3. Create the pre-aggregate table again by `CREATE DATAMAP` command
 Basically, user can manually trigger the operation by re-building the datamap.
 
-
-<script>
-$(function() {
-  // Show selected style on nav item
-  $('.b-nav__datamap').addClass('selected');
-  
-  if (!$('.b-nav__datamap').parent().hasClass('nav__item__with__subs--expanded')) {
-    // Display datamap subnav items
-    $('.b-nav__datamap').parent().toggleClass('nav__item__with__subs--expanded');
-  }
-});
-</script>

