From: Pat Ferrel <pat@occamsmachete.com>
Subject: Re: [ERROR] [TaskSetManager] Task 2.0 in stage 10.0 had a not serializable result
Date: Mon, 16 Oct 2017 11:44:34 -0700
To: Noelia Osés Fernández <noses@vicomtech.org>
Cc: user@predictionio.incubator.apache.org, actionml-user

So all setup is the same for the integration test and your modified test *except the data*?

The error looks like a setup problem because the = serialization should happen with either test. But if the only difference = really is the data, then toss it and use either real data or the = integration test data, why are you trying to synthesize fake data if it = causes the error?

BTW the data you include below in this thread would never = create internal IDs as high as 94 in the vector. You must have switched = to a new dataset???

I would get a dump of your data using `pio export` and make = sure it=E2=80=99s what you thought it was. You claim to have only 4 user = ids and 4 item ids but the serialized vector thinks you have at least 94 = of user or item ids. Something doesn=E2=80=99t add up.
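A minimal sketch of that check (the app id, output path, and the jq counting are placeholders to illustrate; adjust to your setup):

# find the numeric id PIO assigned to TinyApp, then dump its events as JSON
pio app list
pio export --appid <TinyApp-id> --output /tmp/tinyapp-export

# count distinct entity ids and distinct target-entity ids in the export (assumes jq is installed)
cat /tmp/tinyapp-export/part-* | jq -r '.entityId' | sort -u | wc -l
cat /tmp/tinyapp-export/part-* | jq -r '.targetEntityId // empty' | sort -u | wc -l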


On Oct 16, 2017, at 4:43 AM, Noelia = Os=C3=A9s Fern=C3=A1ndez <noses@vicomtech.org> wrote:

Pat, you are absolutely = right! I increased the sleep time and now the integration test for = handmade works perfectly.

However, = the integration test adapted to run with my tiny app runs into the same = problem I've been having with this app: 

[ERROR] [TaskSetManager] Task 1.0 in stage 10.0 (TID 23) had = a not serializable result: = org.apache.mahout.math.RandomAccessSparseVector
Serialization stack:
    - = object not serializable (class: = org.apache.mahout.math.RandomAccessSparseVector, value: = {66:1.0,29:1.0,70:1.0,91:1.0,58:1.0,37:1.0,13:1.0,8:1.0,94:1.0,30:1.0,57:1= .0,22:1.0,20:1.0,35:1.0,97:1.0,60:1.0,27:1.0,72:1.0,3:1.0,34:1.0,77:1.0,46= :1.0,81:1.0,86:1.0,43:1.0})
    - field = (class: scala.Tuple2, name: _2, type: class java.lang.Object)
    - object (class scala.Tuple2, = (1,{66:1.0,29:1.0,70:1.0,91:1.0,58:1.0,37:1.0,13:1.0,8:1.0,94:1.0,30:1.0,5= 7:1.0,22:1.0,20:1.0,35:1.0,97:1.0,60:1.0,27:1.0,72:1.0,3:1.0,34:1.0,77:1.0= ,46:1.0,81:1.0,86:1.0,43:1.0})); not retrying
[ERROR] = [TaskSetManager] Task 2.0 in stage 10.0 (TID 24) had a not serializable = result: org.apache.mahout.math.RandomAccessSparseVector
Serialization stack:

...

Any ideas?

On 15 October 2017 at 19:09, Pat Ferrel <pat@occamsmachete.com> wrote:
This is probably a timing = issue in the integration test, which has to wait for `pio deploy` to = finish before the queries can be made. If it doesn=E2=80=99t finish the = queries will fail. By the time the rest of the test quits the model has = been deployed so you can run queries. In the integration-test script = increase the delay after `pio deploy=E2=80=A6` and see if it passes = then.
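If you'd rather not guess at a sleep value, the fixed wait could be replaced with a poll of the deployed engine. A sketch only (the port and query mirror the curl example below), not the script's actual code:

# wait up to 60 seconds for the deployed engine to answer queries
for i in $(seq 1 60); do
  curl -s -f -H "Content-Type: application/json" -d '{"user": "u1"}' \
    http://localhost:8000/queries.json > /dev/null && break
  sleep 1
done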

This is = probably an integrtion-test script problem not a problem in the = system



On Oct 6, 2017, at 4:21 AM, = Noelia Os=C3=A9s Fern=C3=A1ndez <noses@vicomtech.org> = wrote:

Pat,

I have run the integration test for the = handmade example out of curiosity. Strangely enough things go more or = less as expected apart from the fact that I get a message saying:

...
[INFO] = [CoreWorkflow$] Updating engine instance
[INFO] = [CoreWorkflow$] Training completed successfully.
Model = will remain deployed after this test
Waiting 30 seconds = for the server to start
nohup: redirecting stderr to = stdout
  % Total    % = Received % Xferd  Average Speed   Time    = Time     Time  Current
          &nb= sp;            = ;          Dload  = Upload   Total   Spent    Left  = Speed
  0     = 0    0     0    = 0     0      = 0      0 --:--:-- --:--:-- = --:--:--     0curl: (7) Failed to connect to = localhost port 8000: Connection refused

So the integration test does not manage to get the = recommendations even though the model trained and deployed successfully. = However, as soon as the integration test finishes, on the same terminal, = I can get the recommendations by doing the following:

$ curl -H "Content-Type: application/json" -d '
> {
>     "user": = "u1"
> }' http://localhost:8000/queries.json
{"itemScores":[{"item":"Nexus","score":0.057719700038433075},{"item":"Surface","score":0.0}]}

Isn't this odd? Can you guess what's = going on?

Thank you very much for all = your support!
noelia



On 5 October 2017 at 19:22, Pat = Ferrel <pat@occamsmachete.com> wrote:
Ok, that config should work. Does the = integration test pass?

The data you are using is extremely small and though it does = look like it has cooccurrences, they may not meet minimum =E2=80=9Cbig-dat= a=E2=80=9D thresholds used by default. Try adding more data or use the = handmade example data, rename purchase to view and discard the existing = view data if you wish.
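If you go the handmade-data route, something like this would do the rename (a rough sketch; check the actual sample file name and column order in the UR examples before running it):

# drop the existing view lines, then relabel purchase events as view events
grep -v ',view,' data/sample-handmade-data.txt \
  | sed 's/,purchase,/,view,/' > data/tiny_app_data.csv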

The error is very odd and I=E2=80=99ve never seen it. If the = integration test works I can only surmise it's your data.


On Oct 5, 2017, at 12:02 AM, Noelia Os=C3=A9s Fern=C3=A1ndez = <noses@vicomtech.org> wrote:

SPARK: = spark-1.6.3-bin-hadoop2.6

PIO: = 0.11.0-incubating

Scala: whatever = gets installed when installing PIO 0.11.0-incubating, I haven't = installed Scala separately

UR: = ActionML's UR v0.6.0 I suppose as that's the last version mentioned in = the readme file. I have attached the UR zip file I downloaded from the = actionml github account.

Thank you = for your help!!

On 4 October 2017 at 17:20, Pat = Ferrel <pat@occamsmachete.com> wrote:
What version of Scala. Spark, PIO, and UR are = you using?


On Oct 4, 2017, at 6:10 AM, Noelia Os=C3=A9s Fern=C3=A1ndez = <noses@vicomtech.org> wrote:

Hi = all,

I'm still = trying to create a very simple app to learn to use PredictionIO and = still having trouble. I have done pio build no problem. But when I do = pio train I get a very long error message related to serialisation = (error message copied below).

pio status reports system is all ready = to go.
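For reference, the sequence I'm running from the engine template directory is the standard one, nothing unusual:

pio build    # completes with no problem
pio train    # this is the step that fails with the error below
pio deploy   # never reached, training fails first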

The app I'm trying to build is very simple, it only has = 'view' events. Here's the engine.json:

============================================================
{
  "comment":" This config file uses default settings for all but the required values see README.md for docs",
  "id": "default",
  "description": "Default settings",
  "engineFactory": "com.actionml.RecommendationEngine",
  "datasource": {
    "params" : {
      "name": "tiny_app_data.csv",
      "appName": "TinyApp",
      "eventNames": ["view"]
    }
  },
  "algorithms": [
    {
      "comment": "simplest setup where all values are default, popularity based backfill, must add eventsNames",
      "name": "ur",
      "params": {
        "appName": "TinyApp",
        "indexName": "urindex",
        "typeName": "items",
        "comment": "must have data for the first event or the model will not build, other events are optional",
        "eventNames": ["view"]
      }
    }
  ]
}
============================================================
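A couple of quick sanity checks worth running on a hand-edited config like this (jq is optional; the commands assume you are in the engine directory):

python -m json.tool engine.json > /dev/null && echo "engine.json parses"
jq '.algorithms[0].params.eventNames' engine.json   # should print ["view"]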

The data I'm using is:

"u1","i1"
"u2","i1"
"u2","i2"
"u3","i2"
"u3","i3"
"u4","i4"

meaning user u viewed item i.

The data has been added to the database = with the following python code:

============================================================
"""
Import sample data for recommendation engine
"""

import predictionio
import argparse
import random

RATE_ACTIONS_DELIMITER = ","
SEED = 1


def import_events(client, file):
    random.seed(SEED)
    count = 0
    print "Importing data..."

    items = []
    users = []
    f = open(file, 'r')
    for line in f:
        data = line.rstrip('\r\n').split(RATE_ACTIONS_DELIMITER)
        # NOTE: the CSV fields are quoted, so the ids sent to PIO keep the
        # surrounding quote characters (e.g. "u1" rather than u1)
        users.append(data[0])
        items.append(data[1])
        # one "view" event per line: user data[0] viewed item data[1]
        client.create_event(
            event="view",
            entity_type="user",
            entity_id=data[0],
            target_entity_type="item",
            target_entity_id=data[1]
        )
        print "Event: " + "view" + " entity_id: " + data[0] + " target_entity_id: " + data[1]
        count += 1
    f.close()

    users = set(users)
    items = set(items)
    print "All users: " + str(users)
    print "All items: " + str(items)
    # one "$set" event per distinct item so the items exist in the event store
    for item in items:
        client.create_event(
            event="$set",
            entity_type="item",
            entity_id=item
        )
        count += 1

    print "%s events are imported." % count


if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description="Import sample data for recommendation engine")
    parser.add_argument('--access_key', default='invald_access_key')
    parser.add_argument('--url', default="http://localhost:7070")
    parser.add_argument('--file', default="./data/tiny_app_data.csv")

    args = parser.parse_args()
    print args

    client = predictionio.EventClient(
        access_key=args.access_key,
        url=args.url,
        threads=5,
        qsize=500)
    import_events(client, args.file)
============================================================
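The script is run against the event server in the usual way (the script file name here is a placeholder; the access key comes from pio app new TinyApp / pio app list):

python import_tiny_data.py \
  --access_key <TinyApp-access-key> \
  --url http://localhost:7070 \
  --file ./data/tiny_app_data.csv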

My pio_env.sh is the = following:

============================================================
#!/usr/bin/env bash
#
# Copy this file as pio-env.sh and edit it for your site's configuration.
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# PredictionIO Main Configuration
#
# This section controls core behavior of PredictionIO. It is very likely that
# you need to change these to fit your site.

# SPARK_HOME: Apache Spark is a hard dependency and must be configured.
# SPARK_HOME=$PIO_HOME/vendors/spark-2.0.2-bin-hadoop2.7
SPARK_HOME=$PIO_HOME/vendors/spark-1.6.3-bin-hadoop2.6

POSTGRES_JDBC_DRIVER=$PIO_HOME/lib/postgresql-42.1.4.jar
MYSQL_JDBC_DRIVER=$PIO_HOME/lib/mysql-connector-java-5.1.41.jar

# ES_CONF_DIR: You must configure this if you have advanced configuration for
#              your Elasticsearch setup.
# ES_CONF_DIR=/opt/elasticsearch
#ES_CONF_DIR=$PIO_HOME/vendors/elasticsearch-1.7.6

# HADOOP_CONF_DIR: You must configure this if you intend to run PredictionIO
#                  with Hadoop 2.
# HADOOP_CONF_DIR=/opt/hadoop

# HBASE_CONF_DIR: You must configure this if you intend to run PredictionIO
#                 with HBase on a remote cluster.
# HBASE_CONF_DIR=$PIO_HOME/vendors/hbase-1.0.0/conf

# Filesystem paths where PredictionIO uses as block storage.
PIO_FS_BASEDIR=$HOME/.pio_store
PIO_FS_ENGINESDIR=$PIO_FS_BASEDIR/engines
PIO_FS_TMPDIR=$PIO_FS_BASEDIR/tmp

# PredictionIO Storage Configuration
#
# This section controls programs that make use of PredictionIO's built-in
# storage facilities. Default values are shown below.
#
# For more information on storage configuration please refer to
# http://predictionio.incubator.apache.org/system/anotherdatastore/

# Storage Repositories

# Default is to use PostgreSQL
PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH

PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE

PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS

# Storage Data Sources

# PostgreSQL Default Settings
# Please change "pio" to your database name in PIO_STORAGE_SOURCES_PGSQL_URL
# Please change PIO_STORAGE_SOURCES_PGSQL_USERNAME and
# PIO_STORAGE_SOURCES_PGSQL_PASSWORD accordingly
PIO_STORAGE_SOURCES_PGSQL_TYPE=jdbc
PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://localhost/pio
PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio
PIO_STORAGE_SOURCES_PGSQL_PASSWORD=pio

# MySQL Example
# PIO_STORAGE_SOURCES_MYSQL_TYPE=jdbc
# PIO_STORAGE_SOURCES_MYSQL_URL=jdbc:mysql://localhost/pio
# PIO_STORAGE_SOURCES_MYSQL_USERNAME=pio
# PIO_STORAGE_SOURCES_MYSQL_PASSWORD=pio

# Elasticsearch Example
# PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
# PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost
# PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9200
# PIO_STORAGE_SOURCES_ELASTICSEARCH_SCHEMES=http
# PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=$PIO_HOME/vendors/elasticsearch-5.2.1

# Elasticsearch 1.x Example
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=myprojectES
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=$PIO_HOME/vendors/elasticsearch-1.7.6

# Local File System Example
PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs
PIO_STORAGE_SOURCES_LOCALFS_PATH=$PIO_FS_BASEDIR/models

# HBase Example
PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
PIO_STORAGE_SOURCES_HBASE_HOME=$PIO_HOME/vendors/hbase-1.2.6
============================================================
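With metadata in Elasticsearch, event data in HBase, and models on the local filesystem as configured above, a quick way to confirm PIO can actually reach each backend before training (the same pio status check mentioned earlier):

pio status     # verifies the metadata, event, and model storage backends
pio app list   # confirms TinyApp is registered and shows its access key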

Error = message:


============================================================
[ERROR] [TaskSetManager] Task 2.0 in stage 10.0 (TID 24) had a not serializable result: org.apache.mahout.math.RandomAccessSparseVector
Serialization stack:
    - object not serializable (class: org.apache.mahout.math.RandomAccessSparseVector, value: {3:1.0,2:1.0})
    - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
    - object (class scala.Tuple2, (2,{3:1.0,2:1.0})); not retrying
[ERROR] [TaskSetManager] Task 3.0 in stage 10.0 (TID 25) had a not serializable result: org.apache.mahout.math.RandomAccessSparseVector
Serialization stack:
    - object not serializable (class: org.apache.mahout.math.RandomAccessSparseVector, value: {0:1.0,3:1.0})
    - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
    - object (class scala.Tuple2, (3,{0:1.0,3:1.0})); not retrying
[ERROR] [TaskSetManager] Task 1.0 in stage 10.0 (TID 23) had a not serializable result: org.apache.mahout.math.RandomAccessSparseVector
Serialization stack:
    - object not serializable (class: org.apache.mahout.math.RandomAccessSparseVector, value: {1:1.0})
    - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
    - object (class scala.Tuple2, (1,{1:1.0})); not retrying
[ERROR] [TaskSetManager] Task 0.0 in stage 10.0 (TID 22) had a not serializable result: org.apache.mahout.math.RandomAccessSparseVector
Serialization stack:
    - object not serializable (class: org.apache.mahout.math.RandomAccessSparseVector, value: {0:1.0})
    - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
    - object (class scala.Tuple2, (0,{0:1.0})); not retrying
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 2.0 in stage 10.0 (TID 24) had a not serializable result: org.apache.mahout.math.RandomAccessSparseVector
Serialization stack:
    - object not serializable (class: org.apache.mahout.math.RandomAccessSparseVector, value: {3:1.0,2:1.0})
    - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
    - object (class scala.Tuple2, (2,{3:1.0,2:1.0}))
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1952)
    at org.apache.spark.rdd.RDD$$anonfun$fold$1.apply(RDD.scala:1088)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
    at org.apache.spark.rdd.RDD.fold(RDD.scala:1082)
    at org.apache.mahout.sparkbindings.drm.CheckpointedDrmSpark.computeNRow(CheckpointedDrmSpark.scala:188)
    at org.apache.mahout.sparkbindings.drm.CheckpointedDrmSpark.nrow$lzycompute(CheckpointedDrmSpark.scala:55)
    at org.apache.mahout.sparkbindings.drm.CheckpointedDrmSpark.nrow(CheckpointedDrmSpark.scala:55)
    at org.apache.mahout.sparkbindings.drm.CheckpointedDrmSpark.newRowCardinality(CheckpointedDrmSpark.scala:219)
    at com.actionml.IndexedDatasetSpark$.apply(Preparator.scala:213)
    at com.actionml.Preparator$$anonfun$3.apply(Preparator.scala:71)
    at com.actionml.Preparator$$anonfun$3.apply(Preparator.scala:49)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.AbstractTraversable.map(Traversable.scala:105)
    at com.actionml.Preparator.prepare(Preparator.scala:49)
    at com.actionml.Preparator.prepare(Preparator.scala:32)
    at org.apache.predictionio.controller.PPreparator.prepareBase(PPreparator.scala:37)
    at org.apache.predictionio.controller.Engine$.train(Engine.scala:671)
    at org.apache.predictionio.controller.Engine.train(Engine.scala:177)
    at org.apache.predictionio.workflow.CoreWorkflow$.runTrain(CoreWorkflow.scala:67)
    at org.apache.predictionio.workflow.CreateWorkflow$.main(CreateWorkflow.scala:250)
    at org.apache.predictionio.workflow.CreateWorkflow.main(CreateWorkflow.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
============================================================

Thank = you all for your help.

Best = regards,
noelia




-- 

Noelia = Os=C3=A9s Fern=C3=A1ndez, PhD
Senior Researcher |
Investigadora = Senior


noses@vicomtech.org
+[34] 943 30 92 30
Data Intelligence for Energy and
Industrial Processes | Inteligencia
de Datos = para Energ=C3=ADa y Procesos
Industriales


  

member of:     

Legal Notice - Privacy = policy
<= /div>






-- 

Noelia = Os=C3=A9s Fern=C3=A1ndez, PhD
Senior Researcher |
Investigadora = Senior


noses@vicomtech.org
+[34] 943 30 92 30
Data Intelligence for Energy and
Industrial Processes | Inteligencia
de Datos = para Energ=C3=ADa y Procesos
Industriales


  

member of:     

Legal Notice - Privacy = policy

<= span class=3D"HOEnZb">-- You received this message because you are subscribed to the = Google Groups "actionml-user" group.
To= unsubscribe from this group and stop receiving emails from it, send an = email to actionml-user+unsubscribe@googlegroups.com.
To post to this group, = send email to actionml-user@googlegroups.com.
To= view this discussion on the web visit https://groups.google.com/d/msgid/actionml-user/CAMyseftewAGvt2_XPsRQrDvmFVti4sZLFkZZc_ygpB8k%2Bmjq4A%40mail.gmail.com.
For more options, = visit https://groups.google.com/d/optout.




-- 

Noelia = Os=C3=A9s Fern=C3=A1ndez, PhD
Senior Researcher |
Investigadora = Senior


noses@vicomtech.org
+[34] 943 30 92 30
Data Intelligence for Energy and
Industrial Processes | Inteligencia
de Datos = para Energ=C3=ADa y Procesos
Industriales


  

member of:     

Legal Notice - Privacy = policy
<= br class=3D"">
= --Apple-Mail=_83C68C4A-1E54-41E9-BC91-87C01A584974--