spark-commits mailing list archives

From m...@apache.org
Subject git commit: Fix typo in decision tree docs
Date Tue, 19 Aug 2014 04:43:36 GMT
Repository: spark
Updated Branches:
  refs/heads/master 82577339d -> cd0720ca7


Fix typo in decision tree docs

Candidate splits were inconsistent with the example.

Author: Matt Forbes <matt@tellapart.com>

Closes #1837 from emef/tree-doc and squashes the following commits:

3be14a1 [Matt Forbes] Fix typo in decision tree docs


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/cd0720ca
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/cd0720ca
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/cd0720ca

Branch: refs/heads/master
Commit: cd0720ca77894d481fb73a8b5bb517013843cb1e
Parents: 8257733
Author: Matt Forbes <matt@tellapart.com>
Authored: Mon Aug 18 21:43:32 2014 -0700
Committer: Xiangrui Meng <meng@databricks.com>
Committed: Mon Aug 18 21:43:32 2014 -0700

----------------------------------------------------------------------
 docs/mllib-decision-tree.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/cd0720ca/docs/mllib-decision-tree.md
----------------------------------------------------------------------
diff --git a/docs/mllib-decision-tree.md b/docs/mllib-decision-tree.md
index 9cbd880..c01a92a 100644
--- a/docs/mllib-decision-tree.md
+++ b/docs/mllib-decision-tree.md
@@ -84,8 +84,8 @@ Section 9.2.4 in
 [Elements of Statistical Machine Learning](http://statweb.stanford.edu/~tibs/ElemStatLearn/) for
 details). For example, for a binary classification problem with one categorical feature with three
 categories A, B and C with corresponding proportion of label 1 as 0.2, 0.6 and 0.4, the categorical
-features are ordered as A followed by C followed B or A, B, C. The two split candidates are A \| C, B
-and A , B \| C where \| denotes the split. A similar heuristic is used for multiclass classification
+features are ordered as A followed by C followed B or A, C, B. The two split candidates are A \| C, B
+and A , C \| B where \| denotes the split. A similar heuristic is used for multiclass classification
 when `$2^(M-1)-1$` is greater than the number of bins -- the impurity for each categorical feature value
 is used for ordering.
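
The corrected ordering in the hunk above can be checked with a small sketch. It sorts the categories by their proportion of label 1 and cuts the ordered list at each of the M-1 interior positions. The function name `ordered_split_candidates` and the dictionary layout are hypothetical and chosen for illustration only; this is not MLlib's API or its implementation.

```python
def ordered_split_candidates(label1_proportion):
    """Return the M-1 ordered split candidates for one categorical feature.

    label1_proportion maps each category to its proportion of label 1.
    (Hypothetical helper for illustration; not part of Spark MLlib.)
    """
    # Order categories by ascending proportion of label 1.
    ordered = sorted(label1_proportion, key=label1_proportion.get)
    # Each cut point between adjacent categories yields one candidate split.
    return [(ordered[:i], ordered[i:]) for i in range(1, len(ordered))]


# Values from the documentation example: A, B, C with 0.2, 0.6, 0.4.
proportions = {"A": 0.2, "B": 0.6, "C": 0.4}
candidates = ordered_split_candidates(proportions)

for left, right in candidates:
    print(", ".join(left), "|", ", ".join(right))
```

With the example's proportions the ordering is A, C, B, so the two candidates are A | C, B and A, C | B, which matches the text on the `+` lines of the diff.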
 



