mahout-dev mailing list archives

From "Sean Owen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAHOUT-245) Better handling of Categorical attributes when building Decision Forests
Date Sun, 24 Jan 2010 18:06:17 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804272#action_12804272 ]

Sean Owen commented on MAHOUT-245:
----------------------------------

Can I commit this? Any objections?

> Better handling of Categorical attributes when building Decision Forests
> ------------------------------------------------------------------------
>
>                 Key: MAHOUT-245
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-245
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Classification
>    Affects Versions: 0.3
>            Reporter: Deneche A. Hakim
>            Assignee: Deneche A. Hakim
>             Fix For: 0.3
>
>         Attachments: mahout-245.patch
>
>
> When building a decision tree, a random subset of all variables (attributes) is considered
> for the split at each node.
> If a categorical variable is selected, the data available at the node is split such that
> all instances in each child node share the same value for the selected variable. That variable
> should not be selected again in any sub-node, but the current implementation does not account
> for this. The resulting tree may contain redundant nodes that do not impair its classification
> performance but are nonetheless unnecessary.
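
The behaviour described in the issue can be sketched roughly as follows. This is not the code from mahout-245.patch; the class name, method names, and the boolean "used" array are hypothetical and chosen only for illustration. The idea is to carry a per-branch record of categorical attributes already split on, and to draw each node's random candidate subset only from the attributes not yet recorded.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative sketch only; none of these names come from Mahout itself.
final class AttributeSelection {

    private AttributeSelection() {
    }

    /**
     * Draws up to m candidate attributes at random, skipping every attribute
     * flagged in 'used' (categorical attributes already split on higher up
     * in the current branch).
     */
    static int[] candidateAttributes(boolean[] used, int m, Random rng) {
        List<Integer> available = new ArrayList<>();
        for (int attr = 0; attr < used.length; attr++) {
            if (!used[attr]) {
                available.add(attr);
            }
        }
        Collections.shuffle(available, rng);
        int n = Math.min(m, available.size());
        int[] result = new int[n];
        for (int i = 0; i < n; i++) {
            result[i] = available.get(i);
        }
        return result;
    }

    /**
     * Returns the 'used' flags to pass to the child nodes. Only a categorical
     * split removes the attribute from later consideration; the parent's
     * array is copied so that sibling branches stay independent.
     */
    static boolean[] markUsedIfCategorical(boolean[] used, int attr, boolean categorical) {
        boolean[] copy = used.clone();
        if (categorical) {
            copy[attr] = true;
        }
        return copy;
    }
}
```

A tree builder would call markUsedIfCategorical after choosing the split attribute and pass the returned array down to each child, so that candidateAttributes never offers that attribute again within the same branch. Numerical attributes are left available in this sketch, on the assumption that they can be split again at a different threshold.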

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

