Message-ID: <5257A893.8070405@gmail.com>
Date: Fri, 11 Oct 2013 16:28:19 +0900
From: Makoto YUI
To: user@hive.apache.org
Subject: Re: [ANN] Hivemall: Hive scalable machine learning library
References: <524BD790.8050504@gmail.com> <524EEFEB.8010308@gmail.com>

Hi,

I added support for state-of-the-art classifiers (not yet supported
in Mahout), along with Hivemall's cute(!?) logo, in Hivemall 0.1-rc3.

Newly supported classifiers include:
- Confidence Weighted (CW)
- Adaptive Regularization of Weight Vectors (AROW)
- Soft Confidence Weighted (SCW1, SCW2)

These classifiers are much smarter than the standard SGD-based or
passive-aggressive classifiers.

Please check them out yourself.

Thanks,
Makoto

(2013/10/11 4:28), Clark Yang (杨卓荦) wrote:
> It looks really cool, I think I will try it out.
>
> Cheers,
> Zhuoluo (Clark) Yang
>
>
> 2013/10/5 Makoto YUI
>
> Hi Edward,
>
> Thank you for your interest.
>
> The Hivemall project does not plan to have a dedicated mailing
> list. I will answer questions and comments on Twitter or
> through GitHub issues (with a question label).
>
> BTW, I just added a CTR (Click-Through-Rate) prediction example that is
> provided by a commercial search engine provider for the KDDCup 2012
> track 2.
> https://github.com/myui/hivemall/wiki/KDDCup-2012-track-2-CTR-prediction-dataset
>
> I guess many of you are working on ad CTR/CVR prediction. This example
> might help in understanding how to do it entirely within Hive.
>
> Thanks,
> Makoto @myui
>
>
> (2013/10/04 23:02), Edward Capriolo wrote:
>
> > Looks cool, I'm already starting to play with it.
> >
> > On Friday, October 4, 2013, Makoto Yui wrote:
> > Hi Dean,
> >
> > Thank you for your interest in Hivemall.
> >
> > Twitter's paper actually influenced me in developing Hivemall, and I
> > initially implemented such functionality as Pig UDFs.
> >
> > Though my Pig ML library is not released, you can find a similar
> > attempt for Pig at
> > https://github.com/y-tag/java-pig-MyUDFs
> >
> > Thanks,
> > Makoto
> >
> > 2013/10/3 Dean Wampler:
> >> This is great news! I know that Twitter has done something similar with UDFs
> >> for Pig, as described in this paper:
> >>
> >> http://www.umiacs.umd.edu/~jimmylin/publications/Lin_Kolcz_SIGMOD2012.pdf
> >>
> >> I'm glad to see the same thing start with Hive.
> >>
> >> Dean
> >>
> >>
> >> On Wed, Oct 2, 2013 at 10:21 AM, Makoto YUI wrote:
> >>>
> >>> Hello all,
> >>>
> >>> My employer, AIST, has given the thumbs up to open source our machine
> >>> learning library, named Hivemall.
> >>>
> >>> Hivemall is a scalable machine learning library running on Hive/Hadoop,
> >>> licensed under the LGPL 2.1.
> >>>
> >>> https://github.com/myui/hivemall
> >>>
> >>> Hivemall provides machine learning functionality as well as feature
> >>> engineering functions through UDFs/UDAFs/UDTFs of Hive. It is designed
> >>> to be scalable to the number of training instances as well as the number
> >>> of training features.
> >>>
> >>> Hivemall is very easy to use as every machine learning step is done
> >>> within HiveQL.
> >>>
> >>> -- Installation is just as follows:
> >>> add jar /tmp/hivemall.jar;
> >>> source /tmp/define-all.hive;
> >>>
> >>> -- Logistic regression is performed by a query.
> >>> SELECT
> >>>   feature,
> >>>   avg(weight) as weight
> >>> FROM
> >>>   (SELECT logress(features,label) as (feature,weight) FROM
> >>>    training_features) t
> >>> GROUP BY feature;
> >>>
> >>> You can find detailed examples on our wiki pages.
> >>> https://github.com/myui/hivemall/wiki/_pages
> >>>
> >>> Though we consider Hivemall much easier to use and more scalable
> >>> than Mahout for classification/regression tasks, please check it
> >>> yourself. If you have a Hive environment, you can evaluate Hivemall
> >>> within 5 minutes or so.
> >>>
> >>> Hope you enjoy the release! Feedback (and pull requests) are always welcome.
> >>>
> >>> Thank you,
> >>> Makoto
> >>
> >>
> >>
> >>
> >> --
> >> Dean Wampler, Ph.D.
> >> @deanwampler
> >> http://polyglotprogramming.com
> >
>
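
For readers unfamiliar with the pattern in the quoted logistic regression query: the outer SELECT averages the weights per feature over whatever rows the inner SELECT emits. Below is a minimal Python sketch of that aggregation step only. It assumes the inner query can emit more than one weight for the same feature (for instance one per task, which the message does not state), and the feature names and values are made up for illustration. It does not reproduce logress itself.

```python
from collections import defaultdict


def average_weights(emitted):
    """Average the weights seen for each feature.

    Mirrors `SELECT feature, avg(weight) ... GROUP BY feature`
    over (feature, weight) rows.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for feature, weight in emitted:
        totals[feature] += weight
        counts[feature] += 1
    return {feature: totals[feature] / counts[feature] for feature in totals}


# Hypothetical rows, as if two tasks had each emitted their own weights.
emitted = [
    ("price", 0.5), ("clicks", -1.0),  # assumed output of task 1
    ("price", 1.5), ("age", 2.0),      # assumed output of task 2
]

model = average_weights(emitted)
print(model)
```

A feature that appears in only one task's output keeps its weight unchanged, while a feature seen by several tasks gets the plain mean of their weights.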