From: mktal
To: dev@madlib.incubator.apache.org
Reply-To: dev@madlib.incubator.apache.org
Subject: [GitHub] incubator-madlib pull request: SVM: Add cross validation support a...
Content-Type: text/plain
Message-Id: <20151125083650.2916CE03CE@git1-us-west.apache.org>
Date: Wed, 25 Nov 2015 08:36:50 +0000 (UTC)

Github user mktal commented on a diff in the pull request:

    https://github.com/apache/incubator-madlib/pull/4#discussion_r45840119

    --- Diff: src/ports/postgres/modules/svm/svm.py_in ---
    @@ -440,55 +595,68 @@ def _process_epsilon(is_svc, args):
     def _extract_params(schema_madlib, params, module='SVM'):
         # NOTICE: the type of values in params_default should be consistent with
         # the types specified in params_types
    -    params_default = {'init_stepsize': 0.01,
    -                      'decay_factor': 0.9,
    -                      'max_iter': 100,
    -                      'tolerance': 1e-10,
    -                      'lambda': 1.0,
    -                      'norm': 'L2',
    -                      'n_folds': 0,
    -                      'epsilon': 0.01,
    -                      'eps_table': ''}
    -
    -    params_types = {'init_stepsize': float,
    -                    'decay_factor': float,
    -                    'max_iter': int,
    -                    'tolerance': float,
    -                    'lambda': list,
    -                    'norm': str,
    -                    'n_folds': int,
    -                    'epsilon': float,
    -                    'eps_table': str}
    +    params_default = {
    --- End diff --

    Early stopping in optimization serves a purpose similar to
    regularization. Specifying these optimization parameters as lists
    lets users cross-validate over them, so that the values are chosen
    to give the `best` generalization ability while avoiding over- or
    under-fitting. Another practical concern is making it easier for
    users to choose a proper init_stepsize, which varies from case to
    case depending on both the model and the data.
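
    To make the idea of list-valued optimization parameters concrete,
    here is a minimal Python sketch of how such lists could be expanded
    into a grid and scored by cross validation. This is not the code in
    svm.py_in: expand_grid, select_by_cv and the cv_error callable are
    hypothetical names, and the parameter values are illustrative only.

```python
from itertools import product


def expand_grid(params):
    """Yield one dict per combination of list-valued parameters.

    Scalars are treated as single-element lists, so fixed and
    cross-validated parameters can be mixed in the same dict.
    """
    names = sorted(params)
    value_lists = [
        list(params[n]) if isinstance(params[n], (list, tuple)) else [params[n]]
        for n in names
    ]
    for combo in product(*value_lists):
        yield dict(zip(names, combo))


def select_by_cv(params, cv_error):
    """Return (best_params, best_error) over the expanded grid.

    cv_error is a hypothetical callable supplied by the caller: it takes
    one parameter dict and returns the mean validation error across
    folds (lower is better).
    """
    best = None
    for candidate in expand_grid(params):
        err = cv_error(candidate)
        if best is None or err < best[1]:
            best = (candidate, err)
    return best


# Illustrative values only; these are not MADlib defaults.
grid = {
    'init_stepsize': [0.001, 0.01, 0.1],
    'max_iter': [50, 100],
    'lambda': [0.1, 1.0],
    'norm': 'L2',
}
```

    With a grid like this, max_iter acts as an early-stopping knob that
    is tuned alongside lambda, and init_stepsize no longer has to be
    guessed up front. The cost is one training run per fold for each
    combination, so the lists should stay short.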