From: iyerr3
To: dev@madlib.apache.org
Reply-To: dev@madlib.apache.org
Subject: [GitHub] madlib pull request #229: SVM: Add minibatch as a new solver
Date: Fri, 19 Jan 2018 17:41:07 +0000 (UTC)

GitHub user iyerr3 opened a pull request:

    https://github.com/apache/madlib/pull/229

    SVM: Add minibatch as a new solver

    Additional author: Nikhil Kak

    This work is based on the original work by Xiaocheng Tang in #75.

    This PR adds two main features:
    1. A Minibatch solver that takes as input a batch of data
    2. Minibatching for SVM

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/iyerr3/incubator-madlib feature/svm_minibatch

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/madlib/pull/229.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #229

----
commit 8dde3fc0e42be6fbd97585cc046060b84d624da1
Author: Rahul Iyer
Date:   2018-01-08T21:21:16Z

    New module: Add minibatch as additional optimization framework

commit 8b0af20c79af41d6c5fe023b32bd885f15df5bd1
Author: Rahul Iyer
Date:   2018-01-08T21:23:55Z

    SVM: Add minibatch capabilities

commit 2526d61f36e740df47e6b103e485133deb99ec43
Author: Rahul Iyer
Date:   2018-01-09T00:55:34Z

    Add new dataset for batch

commit 6a8649771d7950d376285b24e25070fd524be519
Author: Nikhil Kak
Date:   2018-01-09T18:52:18Z

    Add install-check test for svm minibatch

commit b5d1adbc5b501640cb0230d5622143d9b25ce4f5
Author: Nikhil Kak
Date:   2018-01-10T18:27:57Z

    Add predict call for svm installcheck test

commit a943b1ac6c162a33eaf269d775061ddc559dd360
Author: Rahul Iyer
Date:   2018-01-10T23:30:00Z

    Update model in getLoss function

commit 971403769f65a05e95a07d2df44c8a30120d025c
Author: Nikhil Kak
Date:   2018-01-11T00:46:07Z

    Refactor svm minibatch to add comments and update variable names.

    We are now using a ColumnVector instead of MappedColumnVector because
    the minibatch transition function wasn't able to convert ColumnVector
    to MappedColumnVector. This required us to not rebind tuple.depVar and
    instead just assign it to y.

commit c86a36ca149a99d569ba54ddaaabe585f729df59
Author: Nikhil Kak
Date:   2018-01-11T21:47:16Z

    Add classification test for svm minibatch and add relevant asserts.

commit a43bff2c0392770148fba2d04fb75c19d48803ee
Author: Rahul Iyer
Date:   2018-01-12T00:42:13Z

    SVM: Fix classification with minibatching

    Changes:
    - Unnest minibatch array data to get all dependent labels
    - Transform dependent labels to 1/-1 by unnesting and then rebuilding
      array
    - Update install-check to use text labels and better thresholds for
      assert

commit 5d29f4ed7eaa8bdac16b72136f70aeca7997b54d
Author: Nikhil Kak
Date:   2018-01-12T01:30:21Z

    Use correct count for averaging the gradient/loss

commit b9a69ddabdade4a3f57577de391d0f093ed93630
Author: Rahul Iyer
Date:   2018-01-12T21:48:45Z

    SVM: Fix minibatch data in install-check

commit 60bda8a147d3a354329c9720fd41bb13650799a9
Author: Rahul Iyer
Date:   2018-01-18T00:09:49Z

    Add assert for data validation (+ comment updates)

----
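For readers unfamiliar with the technique, the following is a rough sketch of minibatch gradient descent for a linear SVM. It is not the code in this pull request: it is plain Python rather than MADlib's C++/SQL, it assumes a hinge loss with an L2 penalty, and every function name and parameter in it is hypothetical. It illustrates two points from the commit messages above: labels are mapped to 1/-1 before training, and the gradient and loss are averaged over the number of rows actually present in each batch.

```python
import random


def to_signed_labels(labels, positive_class):
    """Map arbitrary class labels to +1.0 / -1.0 (hypothetical helper)."""
    return [1.0 if label == positive_class else -1.0 for label in labels]


def hinge_loss_and_gradient(weights, batch_x, batch_y, reg):
    """Average hinge loss and subgradient over one minibatch, plus an L2 term."""
    n = len(batch_x)
    grad = [0.0] * len(weights)
    loss = 0.0
    for x, y in zip(batch_x, batch_y):
        margin = y * sum(w * xi for w, xi in zip(weights, x))
        if margin < 1.0:
            # Only rows inside the margin contribute to the hinge loss.
            loss += 1.0 - margin
            for j, xi in enumerate(x):
                grad[j] -= y * xi
    # Average over the rows in this batch; the last batch may be smaller
    # than the nominal batch size.
    loss = loss / n + 0.5 * reg * sum(w * w for w in weights)
    grad = [g / n + reg * w for g, w in zip(grad, weights)]
    return loss, grad


def train_svm_minibatch(xs, ys, batch_size=2, epochs=200, step=0.1,
                        reg=0.01, seed=0):
    """Run a fixed number of epochs of minibatch gradient descent."""
    rng = random.Random(seed)
    weights = [0.0] * len(xs[0])
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            idx = order[start:start + batch_size]
            _, grad = hinge_loss_and_gradient(
                weights, [xs[i] for i in idx], [ys[i] for i in idx], reg)
            weights = [w - step * g for w, g in zip(weights, grad)]
    return weights


def predict(weights, x):
    """Return +1.0 or -1.0 according to the sign of the linear score."""
    score = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 if score >= 0.0 else -1.0
```

In this sketch a constant 1.0 is appended to each row as an intercept feature, so a call might look like `train_svm_minibatch(xs, to_signed_labels(labels, "yes"))` with rows such as `[2.0, 2.0, 1.0]`.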