ignite-dev mailing list archives

From "Alexey Zinoviev (Jira)" <j...@apache.org>
Subject [jira] [Created] (IGNITE-12396) [ML] Random Forest generates NaN for a part of models on small datasets
Date Tue, 26 Nov 2019 17:07:00 GMT
Alexey Zinoviev created IGNITE-12396:
----------------------------------------

             Summary: [ML] Random Forest generates NaN for a part of models on small datasets
                 Key: IGNITE-12396
                 URL: https://issues.apache.org/jira/browse/IGNITE-12396
             Project: Ignite
          Issue Type: Bug
          Components: ml
    Affects Versions: 3.0
            Reporter: Alexey Zinoviev
            Assignee: Alexey Zinoviev
             Fix For: 3.0


@Override public Double predict(Vector features) {
    double[] predictions = new double[models.size()];

    for (int i = 0; i < models.size(); i++)
        predictions[i] = models.get(i).predict(features);

    return predictionsAggregator.apply(predictions);
}

 

predictionsAggregator receives predictions from many models, and some of them return null,
which still gets aggregated. First of all, handle this in the Aggregator (using a threshold
on the number of broken models before aggregation). Also, RandomForest trees should not return
Double.NaN: training should fail or report an error message when this happens.
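
A minimal sketch of the threshold idea described above, assuming a simple mean aggregator. The class name NanTolerantMeanAggregator and the maxBrokenFraction parameter are hypothetical and are not part of the Ignite API; the actual fix may look different.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not an Ignite class. */
final class NanTolerantMeanAggregator {
    private NanTolerantMeanAggregator() {
        // Static helper, no instances.
    }

    /**
     * Averages the valid predictions and ignores NaN entries.
     *
     * @param predictions Per-model predictions, possibly containing NaN.
     * @param maxBrokenFraction Largest tolerated share of NaN predictions, in [0, 1].
     * @return Mean of the non-NaN predictions.
     */
    static double aggregate(double[] predictions, double maxBrokenFraction) {
        if (predictions.length == 0)
            throw new IllegalArgumentException("No predictions to aggregate");

        // Keep only predictions that carry a real value.
        double[] valid = Arrays.stream(predictions)
            .filter(p -> !Double.isNaN(p))
            .toArray();

        int broken = predictions.length - valid.length;

        // Fail loudly when too many models are broken, or when none is usable.
        if (valid.length == 0 || (double)broken / predictions.length > maxBrokenFraction)
            throw new IllegalStateException(
                broken + " of " + predictions.length + " models returned NaN");

        return Arrays.stream(valid).average().getAsDouble();
    }
}
```

For example, aggregate(new double[] {1.0, Double.NaN, 3.0}, 0.5) averages the two valid values and returns 2.0, while two NaN values out of three exceed the 0.5 threshold and cause an IllegalStateException.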

 

I've tested with 100 and 1000 rows, where it fails; it does not fail on 10 000 rows.

 

RF generates a few models that consist of a single LEAF node with an empty val (Double.NaN by default).
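
A rough sketch of a post-training check for such degenerate trees. The Node class here is a simplified stand-in invented for illustration (the real Ignite tree node class differs), and TrainedForestCheck is a hypothetical name.

```java
import java.util.List;

/** Hypothetical post-training validation; not an Ignite class. */
final class TrainedForestCheck {
    /** Simplified stand-in for a tree node: a leaf flag and a value. */
    static final class Node {
        final boolean leaf;
        final double val;

        Node(boolean leaf, double val) {
            this.leaf = leaf;
            this.val = val;
        }
    }

    private TrainedForestCheck() {
        // Static helper, no instances.
    }

    /** Returns true if the tree is a single leaf whose value was never set. */
    static boolean isEmptyLeaf(Node root) {
        return root.leaf && Double.isNaN(root.val);
    }

    /** Throws if any tree root in the forest is an empty single-leaf tree. */
    static void requireNoEmptyTrees(List<Node> roots) {
        for (int i = 0; i < roots.size(); i++) {
            if (isEmptyLeaf(roots.get(i)))
                throw new IllegalStateException(
                    "Tree " + i + " is a single leaf with no value (NaN)");
        }
    }
}
```

Calling requireNoEmptyTrees right after training would surface the problem at fit time rather than as a NaN at prediction time.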



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
