spark-commits mailing list archives

From yh...@apache.org
Subject spark git commit: [SPARK-11678][SQL][DOCS] Document basePath in the programming guide.
Date Thu, 10 Dec 2015 02:09:52 GMT
Repository: spark
Updated Branches:
  refs/heads/branch-1.6 d86a88da6 -> 9fe8dc916


[SPARK-11678][SQL][DOCS] Document basePath in the programming guide.

This PR adds documentation for `basePath`, a new parameter used by `HadoopFsRelation`.

The compiled doc is shown below.
![image](https://cloud.githubusercontent.com/assets/2072857/11673132/1ba01192-9dcb-11e5-98d9-ac0b4e92e98c.png)

JIRA: https://issues.apache.org/jira/browse/SPARK-11678

Author: Yin Huai <yhuai@databricks.com>

Closes #10211 from yhuai/basePathDoc.

(cherry picked from commit ac8cdf1cdc148bd21290ecf4d4f9874f8c87cc14)
Signed-off-by: Yin Huai <yhuai@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9fe8dc91
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9fe8dc91
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9fe8dc91

Branch: refs/heads/branch-1.6
Commit: 9fe8dc916e8a30914199b1fbb8c3765ba742559a
Parents: d86a88d
Author: Yin Huai <yhuai@databricks.com>
Authored: Wed Dec 9 18:09:36 2015 -0800
Committer: Yin Huai <yhuai@databricks.com>
Committed: Wed Dec 9 18:09:48 2015 -0800

----------------------------------------------------------------------
 docs/sql-programming-guide.md | 7 +++++++
 1 file changed, 7 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/9fe8dc91/docs/sql-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index 9f87acc..3f9a831 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1233,6 +1233,13 @@ infer the data types of the partitioning columns. For these use cases, the autom
 can be configured by `spark.sql.sources.partitionColumnTypeInference.enabled`, which is default to
 `true`. When type inference is disabled, string type will be used for the partitioning columns.
 
+Starting from Spark 1.6.0, partition discovery only finds partitions under the given paths
+by default. For the above example, if users pass `path/to/table/gender=male` to either 
+`SQLContext.read.parquet` or `SQLContext.read.load`, `gender` will not be considered as a
+partitioning column. If users need to specify the base path that partition discovery
+should start with, they can set `basePath` in the data source options. For example,
+when `path/to/table/gender=male` is the path of the data and
+users set `basePath` to `path/to/table/`, `gender` will be a partitioning column.
 
 ### Schema Merging
 

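The paragraph added by this diff can be illustrated with a small toy model. The sketch below is not Spark code and not part of the commit: the function `partition_columns` is a hypothetical helper, and it assumes that discovery starts at the path passed in unless a base path is supplied, and that only `key=value` directory names between the base path and the data path become partitioning columns. Values are kept as plain strings, whereas Spark may infer their types.

```python
import posixpath


def partition_columns(data_path, base_path=None):
    """Toy model: which partitioning columns are found for data_path.

    Hypothetical helper for illustration only; not Spark's implementation.
    """
    data = posixpath.normpath(data_path)
    # Assumed default: discovery starts at the path that was passed in.
    base = posixpath.normpath(base_path) if base_path is not None else data
    if data != base and not data.startswith(base + "/"):
        raise ValueError("data_path must be located under base_path")

    # Only directories between the base path and the data path are inspected.
    relative = data[len(base):].strip("/")
    columns = {}
    for segment in filter(None, relative.split("/")):
        name, sep, value = segment.partition("=")
        if sep:  # only key=value directory names count as partition columns
            columns[name] = value
    return columns


# Without a base path, `gender` is not treated as a partitioning column.
print(partition_columns("path/to/table/gender=male"))
# With the base path set to the table root, `gender` is picked up.
print(partition_columns("path/to/table/gender=male", "path/to/table/"))
```

In Spark itself the option would presumably be passed through the reader, for example `sqlContext.read.option("basePath", "path/to/table/").parquet("path/to/table/gender=male")`. That call is given here as an illustration and does not appear in the commit.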
