mahout-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (MAHOUT-1500) H2O integration
Date Wed, 30 Jul 2014 14:55:39 GMT


ASF GitHub Bot commented on MAHOUT-1500:

Github user gcapan commented on the pull request:
    Tests pass for me for various profiles, and the code looks good. I am a supporter of an engine-agnostic
architecture and separation of actual algorithms from backends, and multiple backends (in
addition to both Spark and H2O being very promising platforms) would force us to implement generic
solutions for data preprocessing, vectorization, machine learning and big data mining. In
summary, my vote is +1 for that contribution.
    PS: Not H2O specific, but wanted to add here: I believe the next step should be standardizing
minimal Matrix I/O capability (i.e. a couple of file formats other than [row_id, VectorWritable]
SequenceFiles) required for a distributed computation engine, and adding data-frame-like structures
that allow text columns.

> H2O integration
> ---------------
>                 Key: MAHOUT-1500
>                 URL:
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Anand Avati
>             Fix For: 1.0
> Provide H2O backend for the Mahout DSL

This message was sent by Atlassian JIRA
