Date: Thu, 6 Apr 2017 23:31:41 +0000 (UTC)
From: "Deron Eriksson (JIRA)"
To: issues@systemml.incubator.apache.org
Reply-To: dev@systemml.incubator.apache.org
Subject: [jira] [Comment Edited] (SYSTEMML-1471) Support PreparedScript for MLContext


    [ https://issues.apache.org/jira/browse/SYSTEMML-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959961#comment-15959961 ]

Deron Eriksson edited comment on SYSTEMML-1471 at 4/6/17 11:31 PM:
-------------------------------------------------------------------

I think you might be able to do something like this with the existing API. You could create a ScoringScriptExecutor class that extends ScriptExecutor.
On this class, create a prepare(Script) method that contains:
{code}
setup(script);
parseScript();
liveVariableAnalysis();
validateScript();
constructHops();
rewriteHops();
rewritePersistentReadsAndWrites();
constructLops();
generateRuntimeProgram();
showExplanation();
globalDataFlowOptimization();
countCompiledMRJobsAndSparkInstructions();
initializeCachingAndScratchSpace();
cleanupRuntimeProgram();
{code}

Then override ScriptExecutor's execute(Script) method and have it contain:
{code}
if (statistics) {
    Statistics.startRunTimer();
}

createAndInitializeExecutionContext();
executeRuntimeProgram();
cleanupAfterExecution();

// add symbol table to MLResults
MLResults mlResults = new MLResults(script);
script.setResults(mlResults);

if (statistics) {
    Statistics.stopRunTimer();
    System.out.println(Statistics.display(statisticsMaxHeavyHitters));
}

return mlResults;
{code}

In the calling code, have something like:
{code}
ScoringScriptExecutor sse = new ScoringScriptExecutor();
sse.prepare(script); // create the dml program
while (....) {
    ...
    MLResults results = ml.execute(script, sse); // execute the dml program
}
{code}


was (Author: deron):
I think you might be able to do something like this with the existing API. You could create a ScoringScriptExecutor class that extends ScriptExecutor.
On this class, create a prepare(Script) method that contains:
{code}
setup(script);
parseScript();
liveVariableAnalysis();
validateScript();
constructHops();
rewriteHops();
rewritePersistentReadsAndWrites();
constructLops();
generateRuntimeProgram();
showExplanation();
globalDataFlowOptimization();
countCompiledMRJobsAndSparkInstructions();
initializeCachingAndScratchSpace();
cleanupRuntimeProgram();
{code}

Then override ScriptExecutor's execute(Script) method and have it contain:
{code}
script.clearAll();

if (statistics) {
    Statistics.startRunTimer();
}

createAndInitializeExecutionContext();
executeRuntimeProgram();
cleanupAfterExecution();

// add symbol table to MLResults
MLResults mlResults = new MLResults(script);
script.setResults(mlResults);

if (statistics) {
    Statistics.stopRunTimer();
    System.out.println(Statistics.display(statisticsMaxHeavyHitters));
}

return mlResults;
{code}

In the calling code, have something like:
{code}
ScoringScriptExecutor sse = new ScoringScriptExecutor();
sse.prepare(script); // create the dml program
while (....) {
    ...
    MLResults results = ml.execute(script, sse); // execute the dml program
}
{code}


> Support PreparedScript for MLContext
> ------------------------------------
>
>                 Key: SYSTEMML-1471
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1471
>             Project: SystemML
>          Issue Type: Improvement
>            Reporter: Niketan Pansare
>
> The intent of this JIRA is three-fold:
> 1. Allow MLContext to be used in a prediction scenario.
> 2. Consolidate the code of JMLC and MLContext.
> 3. Explore what extensions are needed in SystemML to support Spark streaming.
> For the prediction scenario, it is important to reduce the parsing/validation overhead as much as possible, and reusing the JMLC infrastructure might be a good step in that direction.
> It is also important that MLContext continues to support dynamic recompilation and other optimizations, as the input size could be small (similar to JMLC) but could also be large (if the window size is large, making MLContext ideal for this scenario).
> {code}
> val streamingContext = new StreamingContext(sc, SLIDE_INTERVAL)
> val windowDStream = .....window(WINDOW_LENGTH, SLIDE_INTERVAL)
> val preparedScript = ....prepareScript(....)
> windowDStream.foreachRDD(currentWindow => {
>   if (currentWindow.count() > 0) {
>     ml.execute(preparedScript.in("X", currentWindow.toDF()))
>     ...
>   }
> })
> {code}
> [~deron] [~mboehm7] [~reinwald] [~freiss] [~mwdusenb@us.ibm.com] [~nakul02] Is this something that interests any of you?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
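
For readers who want to see the prepare-once, execute-many pattern from the comment in isolation, here is a minimal, self-contained Java sketch. It does not use SystemML at all: PrepareOnceSketch, BaseExecutor, ScoringExecutor and CompiledProgram are hypothetical stand-ins invented for illustration, "compilation" is reduced to a counter, and the "program" simply doubles its input. It only shows how a subclass might move the compile steps into prepare() and leave execute() with the runtime steps.

```java
import java.util.ArrayList;
import java.util.List;

public class PrepareOnceSketch {

    /** Hypothetical stand-in for a compiled program; not a SystemML class. */
    static final class CompiledProgram {
        final String source;

        CompiledProgram(String source) {
            this.source = source;
        }

        /** Placeholder runtime: ignores the source text and doubles the input. */
        int run(int input) {
            return input * 2;
        }
    }

    /** Hypothetical stand-in for an executor that compiles on every execute() call. */
    static class BaseExecutor {
        protected CompiledProgram program;
        protected int compileCount = 0;

        /** Stands in for the parse/validate/compile steps. */
        protected void compile(String source) {
            compileCount++;
            program = new CompiledProgram(source);
        }

        public int execute(String source, int input) {
            compile(source);
            return program.run(input);
        }
    }

    /** Compiles once in prepare(); execute() only runs the cached program. */
    static class ScoringExecutor extends BaseExecutor {
        public void prepare(String source) {
            compile(source);
        }

        @Override
        public int execute(String source, int input) {
            if (program == null) {
                throw new IllegalStateException("call prepare() before execute()");
            }
            return program.run(input);
        }
    }

    public static void main(String[] args) {
        ScoringExecutor sse = new ScoringExecutor();
        sse.prepare("Y = X * 2"); // compile once

        List<Integer> results = new ArrayList<>();
        for (int x = 1; x <= 3; x++) {
            results.add(sse.execute("Y = X * 2", x)); // run many times
        }
        System.out.println("results=" + results + " compiles=" + sse.compileCount);
    }
}
```

Running main should show three results with a compile count of 1, which is the saving the prediction scenario in the issue is after. Whether the real ScriptExecutor steps can be split this cleanly (for example with dynamic recompilation enabled) is not something this sketch establishes.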