hive-user mailing list archives

From Makoto Yui <yuin...@gmail.com>
Subject [ANN] Hivemall v0.3.2 is now available
Date Wed, 10 Jun 2015 07:37:35 GMT
Hello all,

We have released a new version of Hivemall, v0.3.2.

Hivemall provides machine learning functionality as Hive UDFs/UDAFs/UDTFs.
Hivemall is easy to use because every machine learning step is done
within HiveQL.

   https://github.com/myui/hivemall

In the latest release (v0.3.2), we introduced

   o Anomaly Detection using Local Outlier Factor, and
   o Polynomial features, which are useful for non-linear
     regression/classification.
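
To illustrate what polynomial features mean for "name:value" feature
strings like the ones used below, here is a minimal Python sketch. It is
not Hivemall's implementation: the function names and the "*" separator
used to name combined features are assumptions made for this example.

```python
from itertools import combinations_with_replacement


def parse(features):
    """Turn ["name:value", ...] strings into a {name: float} dict."""
    parsed = {}
    for item in features:
        name, value = item.rsplit(":", 1)
        parsed[name] = float(value)
    return parsed


def polynomial_features(features, degree=2):
    """Return all products of up to `degree` features (illustrative only).

    The "a*b" naming of combined features is hypothetical and is not
    taken from Hivemall.
    """
    base = parse(features)
    names = sorted(base)
    expanded = {}
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(names, d):
            value = 1.0
            for name in combo:
                value *= base[name]
            expanded["*".join(combo)] = value
    return expanded


if __name__ == "__main__":
    print(polynomial_features(["a:2.0", "b:3.0"]))
```

A linear model trained on the expanded features can fit interactions
such as a*b, which is what makes the expansion useful for non-linear
problems.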

Anomaly Detection in Hivemall [1] is very easy to use.

1) Just prepare a table (e.g., a table containing sensor data) as follows.

| rowid | features
|-------| ----------
| 1     | ["reflectance:0.5252967","specific_heat:0.19863537","weight:0.0"]
| 2     | ["reflectance:0.5950446","specific_heat:0.09166764","weight:0.052084323"]
| 3     | ["reflectance:0.6797837","specific_heat:0.12567581","weight:0.13255163"]
| 4     | ...

2) Run a query to find the top-K outliers. The result lists the outlier candidates.

| rowid | LOF value
| ----- | -------------
|  87   | 3.031143750623693  (<- rowid 87 is an outlier in this case)
|  16   | 1.975556449228491
|  1    | 1.8415763677073722
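
For readers unfamiliar with how an LOF value is computed, here is a
minimal in-memory Python sketch of the standard Local Outlier Factor
definition. It is not how Hivemall computes it (Hivemall does this with
HiveQL queries, see [1]); the function names, Euclidean distance, and
the choice of k are assumptions for this example, and it assumes the
input has more than k rows and no duplicate points.

```python
import math


def parse(features):
    """Turn ["name:value", ...] strings into a {name: float} dict."""
    return {f.rsplit(":", 1)[0]: float(f.rsplit(":", 1)[1]) for f in features}


def euclidean(p, q):
    """Euclidean distance; a feature missing from a row counts as 0.0."""
    keys = set(p) | set(q)
    return math.sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))


def lof_scores(rows, k):
    """Return {rowid: LOF value} for rows given as {rowid: ["name:value", ...]}."""
    points = {rowid: parse(feats) for rowid, feats in rows.items()}
    ids = list(points)
    dist = {a: {b: euclidean(points[a], points[b]) for b in ids if b != a}
            for a in ids}

    k_distance, neighbors = {}, {}
    for a in ids:
        ordered = sorted(dist[a].items(), key=lambda kv: kv[1])
        k_distance[a] = ordered[k - 1][1]
        # Keep ties at the k-distance, as in the standard definition.
        neighbors[a] = [b for b, d in ordered if d <= k_distance[a]]

    # Local reachability density: inverse of the mean reachability distance.
    lrd = {}
    for a in ids:
        total = sum(max(k_distance[b], dist[a][b]) for b in neighbors[a])
        lrd[a] = len(neighbors[a]) / total

    # LOF: mean density of the neighbors relative to the point's own density.
    return {a: sum(lrd[b] for b in neighbors[a]) / (len(neighbors[a]) * lrd[a])
            for a in ids}


def top_k_outliers(rows, k, top):
    """Return the `top` (rowid, LOF) pairs with the largest LOF values."""
    scores = lof_scores(rows, k)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top]


if __name__ == "__main__":
    rows = {1: ["x:0.0"], 2: ["x:1.0"], 3: ["x:2.0"], 4: ["x:3.0"], 5: ["x:10.0"]}
    print(top_k_outliers(rows, k=2, top=3))
```

Values close to 1 mean a row is about as dense as its neighbors; values
well above 1, like rowid 87 above, mark rows that sit in a much sparser
region than their neighbors.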

Hope you enjoy the release! Feedback and pull requests are welcome.

Last but not least, as of v0.3.1 the license of Hivemall has changed
from LGPL v2 to Apache License v2.

[1] https://github.com/myui/hivemall/wiki/Outlier-Detection-using-Local-Outlier-Factor

Thanks,
Makoto

--
Makoto YUI
Research Engineer, Treasure Data, Inc.
http://myui.github.io/
