From: git-site-role@apache.org
To: commits@predictionio.incubator.apache.org
Date: Wed, 14 Mar 2018 22:08:14 -0000
Subject: [15/51] [partial] predictionio-site git commit: Documentation based on apache/predictionio#439b87e07a59021839ea3fe2cd40f98fb8d4cc5f

http://git-wip-us.apache.org/repos/asf/predictionio-site/blob/9d2bd407/evaluation/paramtuning/index.html
----------------------------------------------------------------------
diff --git a/evaluation/paramtuning/index.html b/evaluation/paramtuning/index.html
index 5c5b0d1..d468550 100644
--- a/evaluation/paramtuning/index.html
+++ b/evaluation/paramtuning/index.html
@@ -1,4 +1,4 @@
< div class="hidden-md hidden-lg" id="mobile-page-heading-wrapper">

PredictionIO Docs

Hyperparameter Tuning

A PredictionIO engine is instantiated by a set of parameters. These parameters define which algorithm is to be used, as well supply the parameters for the algorithm itself. This naturally raises the question of how to choose the best set of parameters. The evaluation module streamlines the process of tuning the engine to the best parameter set and deploys it.

Quick Start

We demonstrate the evaluation with the classification template. The classification template uses a naive bayesian algorithm that has a smoothing parameter. We evaluate the prediction quality against different parameter values to find the best parameter values, and then deploy it.

Edit the AppId

Edit MyClassification/src/main/scala/Evaluation.scala to specify the appId you used to import the data.

1
Hyperparameter Tuning

A PredictionIO engine is instantiated by a set of parameters. These parameters define which algorithm is to be used, as well as supply the parameters for the algorithm itself. This naturally raises the question of how to choose the best set of parameters. The evaluation module streamlines the process of tuning the engine to the best parameter set and deploying it.

Quick Start

We demonstrate the evaluation with the classification template. The classification template uses a Naive Bayes algorithm that has a smoothing parameter. We evaluate the prediction quality against different values of this parameter to find the best one, and then deploy the engine with it.
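
For orientation, prediction quality in the classification template is measured by a simple accuracy metric defined in Evaluation.scala. The sketch below shows roughly what that metric and the Evaluation object look like; the package and import paths are assumptions based on the Apache PredictionIO classification template of this era, and Query, PredictedResult, ActualResult and ClassificationEngine are assumed to be defined in the template's other sources (e.g. Engine.scala).

  package org.template.classification

  import org.apache.predictionio.controller.AverageMetric
  import org.apache.predictionio.controller.EmptyEvaluationInfo
  import org.apache.predictionio.controller.Evaluation

  // Scores a single prediction: 1.0 if the predicted label matches the actual
  // label, 0.0 otherwise. AverageMetric averages this score over all evaluated queries.
  case class Accuracy()
    extends AverageMetric[EmptyEvaluationInfo, Query, PredictedResult, ActualResult] {
    def calculate(query: Query, predicted: PredictedResult, actual: ActualResult): Double =
      if (predicted.label == actual.label) 1.0 else 0.0
  }

  // Wires the engine factory and the metric together; this is the evaluation
  // object that the evaluation run executes.
  object AccuracyEvaluation extends Evaluation {
    engineMetric = (ClassificationEngine(), new Accuracy())
  }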

Edit the AppId

Edit MyClassification/src/main/scala/Evaluation.scala to specify the appId you used to import the data.

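The relevant portion of Evaluation.scala looks roughly like the sketch below: a base EngineParams carries the appId (plus an evalK value used for cross-validation), and the engine params list enumerates one variant per smoothing value to evaluate. The concrete values here (appId = 18, smoothing values 10.0, 100.0, 1000.0) are illustrative assumptions; substitute the appId you used to import the data. DataSourceParams, AlgorithmParams and the "naive" algorithm name are assumed to be defined in the stock template's sources.

  package org.template.classification

  import org.apache.predictionio.controller.EngineParams
  import org.apache.predictionio.controller.EngineParamsGenerator

  object EngineParamsList extends EngineParamsGenerator {
    // Base engine params: appId selects the data set you imported, and evalK
    // sets the number of folds used for cross-validation.
    private[this] val baseEP = EngineParams(
      dataSourceParams = DataSourceParams(appId = 18, evalK = Some(5)))

    // One engine params variant per smoothing value to evaluate; the best
    // variant found by the evaluation run is reported in best.json.
    engineParamsList = Seq(
      baseEP.copy(algorithmParamsList = Seq(("naive", AlgorithmParams(10.0)))),
      baseEP.copy(algorithmParamsList = Seq(("naive", AlgorithmParams(100.0)))),
      baseEP.copy(algorithmParamsList = Seq(("naive", AlgorithmParams(1000.0)))))
  }
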
@@ -385,7 +385,7 @@ Metrics:
   org.template.classification.Accuracy: 0.9281045751633987
 The best variant params can be found in best.json
 [INFO] [CoreWorkflow$] runEvaluation completed

Notes

  • We deliberately did not mention the test set in this hyperparameter tuning guide. In the machine learning literature, the test set is a separate piece of data used to evaluate the final engine params output by the evaluation process. This guarantees that no information from the training / validation set leaks into the engine params and biases the outcome. With PredictionIO, there are multiple ways of conducting robust tuning; we will cover this topic in the coming sections.