hive-user mailing list archives

From Nitin Pawar <nitinpawar...@gmail.com>
Subject Re: [ANN] Hivemall: Hive scalable machine learning library
Date Fri, 11 Oct 2013 09:28:22 GMT
Just tried this on some hot trends in forum management. It was pretty
impressive.

I will try it in more depth and, if possible, integrate it into my product.

Thanks for the awesome work.

Nitin


On Fri, Oct 11, 2013 at 12:58 PM, Makoto YUI <yuin405@gmail.com> wrote:

> Hi,
>
> I added support for state-of-the-art classifiers (not yet supported in
> Mahout), as well as Hivemall's cute(!?) logo, in Hivemall 0.1-rc3.
>
> Newly supported classifiers include
> - Confidence Weighted (CW)
> - Adaptive Regularization of Weight Vectors (AROW)
> - Soft Confidence Weighted (SCW1, SCW2)
>
> Those classifiers are much smarter than the standard SGD-based or
> passive-aggressive classifiers. Please check them out for yourself.
>
> Thanks,
> Makoto
>
>
> (2013/10/11 4:28), Clark Yang (杨卓荦) wrote:
>
>> It looks really cool. I think I will try it out.
>>
>> Cheers,
>> Zhuoluo (Clark) Yang
>>
>>
>> 2013/10/5 Makoto YUI <yuin405@gmail.com>
>>
>>
>>     Hi Edward,
>>
>>     Thank you for your interest.
>>
>>     The Hivemall project has no plans for a dedicated mailing list.
>>     I will answer follow-up questions/comments on Twitter or
>>     through GitHub issues (with a question label).
>>
>>     BTW, I just added a CTR (Click-Through-Rate) prediction example that
>>     is provided by a commercial search engine provider for the KDDCup 2012
>>     track 2.
>>     https://github.com/myui/hivemall/wiki/KDDCup-2012-track-2-CTR-prediction-dataset
>>
>>     I guess many of you are working on ad CTR/CVR prediction. This example
>>     might be of some help in understanding how to do it entirely within Hive.
>>
>>     Thanks,
>>     Makoto @myui
>>
>>
>>     (2013/10/04 23:02), Edward Capriolo wrote:
>>
>>         Looks cool. I'm already starting to play with it.
>>
>>         On Friday, October 4, 2013, Makoto Yui <yuin405@gmail.com> wrote:
>>           > Hi Dean,
>>           >
>>           > Thank you for your interest in Hivemall.
>>           >
>>           > Twitter's paper actually influenced me in developing Hivemall and I
>>           > initially implemented such functionality as Pig UDFs.
>>           >
>>           > Though my Pig ML library is not released, you can find a similar
>>           > attempt for Pig in
>>           > https://github.com/y-tag/java-pig-MyUDFs
>>           >
>>           > Thanks,
>>           > Makoto
>>           >
>>           > 2013/10/3 Dean Wampler <deanwampler@gmail.com>:
>>
>>
>>           >> This is great news! I know that Twitter has done something similar
>>           >> with UDFs for Pig, as described in this paper:
>>           >>
>>           >> http://www.umiacs.umd.edu/~jimmylin/publications/Lin_Kolcz_SIGMOD2012.pdf
>>
>>           >>
>>           >> I'm glad to see the same thing start with Hive.
>>           >>
>>           >> Dean
>>           >>
>>           >>
>>           >> On Wed, Oct 2, 2013 at 10:21 AM, Makoto YUI <yuin405@gmail.com> wrote:
>>           >>>
>>           >>> Hello all,
>>           >>>
>>           >>> My employer, AIST, has given the thumbs up to open source our machine
>>           >>> learning library, named Hivemall.
>>           >>>
>>           >>> Hivemall is a scalable machine learning library running on Hive/Hadoop,
>>           >>> licensed under the LGPL 2.1.
>>           >>>
>>           >>> https://github.com/myui/hivemall
>>           >>>
>>           >>> Hivemall provides machine learning functionality as well as feature
>>           >>> engineering functions through UDFs/UDAFs/UDTFs of Hive. It is designed
>>           >>> to be scalable to the number of training instances as well as the number
>>           >>> of training features.
>>           >>>
>>           >>> Hivemall is very easy to use as every machine learning step is done
>>           >>> within HiveQL.
>>           >>>
>>           >>> -- Installation is just as follows:
>>           >>> add jar /tmp/hivemall.jar;
>>           >>> source /tmp/define-all.hive;
>>           >>>
>>           >>> -- Logistic regression is performed by a query.
>>           >>> SELECT
>>           >>>   feature,
>>           >>>   avg(weight) as weight
>>           >>> FROM
>>           >>>   (SELECT logress(features, label) as (feature, weight)
>>           >>>    FROM training_features) t
>>           >>> GROUP BY feature;
>>           >>>
>>           >>> You can find detailed examples on our wiki pages.
>>           >>> https://github.com/myui/hivemall/wiki/_pages
>>           >>>
>>           >>> Though we believe Hivemall is much easier to use and more scalable
>>           >>> than Mahout for classification/regression tasks, please check it for
>>           >>> yourself. If you have a Hive environment, you can evaluate Hivemall
>>           >>> within 5 minutes or so.
>>           >>>
>>           >>> Hope you enjoy the release! Feedback (and pull requests) is always
>>           >>> welcome.
>>           >>>
>>           >>> Thank you,
>>           >>> Makoto
>>           >>
>>           >>
>>           >>
>>           >>
>>           >> --
>>           >> Dean Wampler, Ph.D.
>>           >> @deanwampler
>>           >> http://polyglotprogramming.com
>>           >
>>
>>
>>
>>
>


-- 
Nitin Pawar
