madlib-user mailing list archives

From Frank McQuillan
Subject Apache MADlib user survey results
Date Mon, 07 Nov 2016 19:55:41 GMT
We recently ran a survey asking MADlib users about a wide range of topics
pertaining to this open source project, including desired new features.
Thank you to all who responded.

You are welcome to view the survey results:
and make any comments or suggestions.

Quick summary:

* Received ~40 responses from 27 different companies
* ~50% of respondents have 1 year or less of MADlib use
* Fraud detection is the most common use case
* Regression (various), clustering and random forest are the most commonly
used MADlib algorithms
* Gradient boosting is the most commonly requested new algorithm
* Users prefer new algorithms over improvements to existing algorithms
by a 2:1 margin
* Improved documentation/examples and better performance are the biggest
* The most common other tools used by respondents are R, Spark and Python
(and associated libraries)

