predictionio-user mailing list archives

From Donald Szeto <don...@apache.org>
Subject Re: [ERROR] [TaskSetManager] Task 2.0 in stage 10.0 had a not serializable result
Date Wed, 18 Oct 2017 15:49:12 GMT
Chiming in a bit. Looking at the serialization error, it looks like we are
just one little step away from getting this to work.

Noelia, what does your synthesized data look like? All data that Spark
processes needs to be serializable. At some point, the non-serializable
vector object shown in the serialization stack is created from your
synthesized data. It would be great to know what your input events look
like and to see where in the code path this happens.
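
One guess that may be worth ruling out, and it is only an assumption on my
part: Mahout's Spark bindings generally rely on Kryo serialization, and the
engine.json quoted below has no "sparkConf" section. Here is a minimal Python
sketch of that check. The helper names are made up for illustration, and the
two settings listed are the ones I would expect to see, not something
confirmed in this thread.

```python
import json

# Serializer settings commonly used with Mahout's Spark bindings.
# ASSUMPTION: these are what the UR expects; verify against the UR README.
EXPECTED_SPARK_CONF = {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.kryo.registrator":
        "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
}


def missing_spark_conf(engine_conf):
    """Return the expected sparkConf entries that are absent or different."""
    spark_conf = engine_conf.get("sparkConf", {})
    return {
        key: value
        for key, value in EXPECTED_SPARK_CONF.items()
        if spark_conf.get(key) != value
    }


def check_engine_json(path="engine.json"):
    """Load an engine.json file (hypothetical path) and run the check."""
    with open(path) as handle:
        return missing_spark_conf(json.load(handle))


# Trimmed-down stand-in for the engine.json posted in this thread:
# it has "datasource" and "algorithms" but no "sparkConf" section.
posted = {"id": "default", "datasource": {}, "algorithms": []}
print(sorted(missing_spark_conf(posted)))
```

If the check reports missing keys, that would at least be consistent with
Spark falling back to Java serialization for the Mahout vector, but please
treat it as a lead to verify rather than a diagnosis.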

Regards,
Donald

On Tue, Oct 17, 2017 at 12:14 AM Noelia Osés Fernández <noses@vicomtech.org>
wrote:

> Pat, you mentioned the problem could be that the data I was using was too
> small. So now I'm using the attached data file as the data (4 users and 100
> items). But I'm still getting the same error. I'm sorry I forgot to mention
> I had increased the dataset.
>
> The reason why I want to make it work with a very small dataset is because
> I want to be able to follow the calculations. I want to understand what the
> UR is doing and understand the impact of changing this or that, here or
> there... I find that easier to achieve with a small example in which I know
> exactly what's happening. I want to build trust in my understanding of
> the UR before I move on to applying it to a real problem. If I'm not
> confident that I know how to use it, how can I tell my client that the
> results I'm getting are good with any degree of confidence?
>
> On 16 October 2017 at 20:44, Pat Ferrel <pat@occamsmachete.com> wrote:
>
>> So all setup is the same for the integration-test and your modified test
>> *except the data*?
>>
>> The error looks like a setup problem because the serialization should
>> happen with either test. But if the only difference really is the data,
>> then toss it and use either real data or the integration test data. Why are
>> you trying to synthesize fake data if it causes the error?
>>
>> BTW the data you include below in this thread would never create internal
>> IDs as high as 94 in the vector. You must have switched to a new dataset???
>>
>> I would get a dump of your data using `pio export` and make sure it’s
>> what you thought it was. You claim to have only 4 user ids and 4 item ids
>> but the serialized vector thinks you have at least 94 user or item ids.
>> Something doesn’t add up.
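
Following up inline on the `pio export` suggestion: a rough Python sketch,
under some assumptions, for comparing the dump with the vector printed in the
error. It assumes the export holds one JSON event per line with `entityId`
and `targetEntityId` fields, and that the vector indices are 0-based. The
function names are hypothetical.

```python
import json
import re


def min_cardinality(vector_text):
    """Lower bound on vector size: highest 0-based index seen, plus one."""
    indices = [int(i) for i in re.findall(r"[{,]\s*(\d+):", vector_text)]
    return max(indices) + 1 if indices else 0


def distinct_ids(event_lines, event_name="view"):
    """Collect distinct user and item ids from JSON-lines events."""
    users, items = set(), set()
    for line in event_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("event") != event_name:
            continue  # skip $set and any other event types
        users.add(event["entityId"])
        items.add(event["targetEntityId"])
    return users, items


# Made-up sample lines standing in for the real export files.
sample_events = [
    '{"event": "view", "entityId": "u1", "targetEntityId": "i1"}',
    '{"event": "view", "entityId": "u2", "targetEntityId": "i1"}',
    '{"event": "$set", "entityType": "item", "entityId": "i1"}',
]
users, items = distinct_ids(sample_events)
print("distinct users: %d, distinct items: %d" % (len(users), len(items)))
print("vector implies at least %d ids" % min_cardinality("{3:1.0,2:1.0}"))
```

If the distinct counts from the dump are smaller than the bound implied by
the vector, the event store probably holds more data than the CSV you meant
to import.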
>>
>>
>> On Oct 16, 2017, at 4:43 AM, Noelia Osés Fernández <noses@vicomtech.org>
>> wrote:
>>
>> Pat, you are absolutely right! I increased the sleep time and now the
>> integration test for handmade works perfectly.
>>
>> However, the integration test adapted to run with my tiny app runs into
>> the same problem I've been having with this app:
>>
>> [ERROR] [TaskSetManager] Task 1.0 in stage 10.0 (TID 23) had a not
>> serializable result: org.apache.mahout.math.RandomAccessSparseVector
>> Serialization stack:
>>     - object not serializable (class:
>> org.apache.mahout.math.RandomAccessSparseVector, value:
>> {66:1.0,29:1.0,70:1.0,91:1.0,58:1.0,37:1.0,13:1.0,8:1.0,94:1.0,30:1.0,57:1.0,22:1.0,20:1.0,35:1.0,97:1.0,60:1.0,27:1.0,72:1.0,3:1.0,34:1.0,77:1.0,46:1.0,81:1.0,86:1.0,43:1.0})
>>     - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
>>     - object (class scala.Tuple2,
>> (1,{66:1.0,29:1.0,70:1.0,91:1.0,58:1.0,37:1.0,13:1.0,8:1.0,94:1.0,30:1.0,57:1.0,22:1.0,20:1.0,35:1.0,97:1.0,60:1.0,27:1.0,72:1.0,3:1.0,34:1.0,77:1.0,46:1.0,81:1.0,86:1.0,43:1.0}));
>> not retrying
>> [ERROR] [TaskSetManager] Task 2.0 in stage 10.0 (TID 24) had a not
>> serializable result: org.apache.mahout.math.RandomAccessSparseVector
>> Serialization stack:
>>
>> ...
>>
>> Any ideas?
>>
>> On 15 October 2017 at 19:09, Pat Ferrel <pat@occamsmachete.com> wrote:
>>
>>> This is probably a timing issue in the integration test, which has to
>>> wait for `pio deploy` to finish before the queries can be made. If it
>>> doesn’t finish, the queries will fail. By the time the rest of the test
>>> quits, the model has been deployed, so you can run queries. In the
>>> integration-test script, increase the delay after `pio deploy…` and see if
>>> it passes then.
>>>
>>> This is probably an integration-test script problem, not a problem in the
>>> system.
>>>
>>>
>>>
>>> On Oct 6, 2017, at 4:21 AM, Noelia Osés Fernández <noses@vicomtech.org>
>>> wrote:
>>>
>>> Pat,
>>>
>>> I have run the integration test for the handmade example out of
>>> curiosity. Strangely enough things go more or less as expected apart from
>>> the fact that I get a message saying:
>>>
>>> ...
>>> [INFO] [CoreWorkflow$] Updating engine instance
>>> [INFO] [CoreWorkflow$] Training completed successfully.
>>> Model will remain deployed after this test
>>> Waiting 30 seconds for the server to start
>>> nohup: redirecting stderr to stdout
>>>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>>>                                  Dload  Upload   Total   Spent    Left  Speed
>>>   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
>>> curl: (7) Failed to connect to localhost port 8000: Connection refused
>>>
>>> So the integration test does not manage to get the recommendations even
>>> though the model trained and deployed successfully. However, as soon as the
>>> integration test finishes, on the same terminal, I can get the
>>> recommendations by doing the following:
>>>
>>> $ curl -H "Content-Type: application/json" -d '
>>> > {
>>> >     "user": "u1"
>>> > }' http://localhost:8000/queries.json
>>>
>>> {"itemScores":[{"item":"Nexus","score":0.057719700038433075},{"item":"Surface","score":0.0}]}
>>>
>>> Isn't this odd? Can you guess what's going on?
>>>
>>> Thank you very much for all your support!
>>> noelia
>>>
>>>
>>>
>>> On 5 October 2017 at 19:22, Pat Ferrel <pat@occamsmachete.com> wrote:
>>>
>>>> Ok, that config should work. Does the integration test pass?
>>>>
>>>> The data you are using is extremely small and though it does look like
>>>> it has cooccurrences, they may not meet minimum “big-data” thresholds used
>>>> by default. Try adding more data or use the handmade example data, rename
>>>> purchase to view and discard the existing view data if you wish.
>>>>
>>>> The error is very odd and I’ve never seen it. If the integration test
>>>> works I can only surmise it's your data.
>>>>
>>>>
>>>> On Oct 5, 2017, at 12:02 AM, Noelia Osés Fernández <noses@vicomtech.org>
>>>> wrote:
>>>>
>>>> SPARK: spark-1.6.3-bin-hadoop2.6
>>>>
>>>> PIO: 0.11.0-incubating
>>>>
>>>> Scala: whatever gets installed when installing PIO 0.11.0-incubating, I
>>>> haven't installed Scala separately
>>>>
>>>> UR: ActionML's UR v0.6.0 I suppose as that's the last version mentioned
>>>> in the readme file. I have attached the UR zip file I downloaded from the
>>>> actionml github account.
>>>>
>>>> Thank you for your help!!
>>>>
>>>> On 4 October 2017 at 17:20, Pat Ferrel <pat@occamsmachete.com> wrote:
>>>>
>>>>> What versions of Scala, Spark, PIO, and UR are you using?
>>>>>
>>>>>
>>>>> On Oct 4, 2017, at 6:10 AM, Noelia Osés Fernández <noses@vicomtech.org>
>>>>> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I'm still trying to create a very simple app to learn to use
>>>>> PredictionIO and still having trouble. I have done pio build no problem.
>>>>> But when I do pio train I get a very long error message related to
>>>>> serialisation (error message copied below).
>>>>>
>>>>> pio status reports system is all ready to go.
>>>>>
>>>>> The app I'm trying to build is very simple, it only has 'view' events.
>>>>> Here's the engine.json:
>>>>>
>>>>> *===========================================================*
>>>>> {
>>>>>   "comment":" This config file uses default settings for all but the required values see README.md for docs",
>>>>>   "id": "default",
>>>>>   "description": "Default settings",
>>>>>   "engineFactory": "com.actionml.RecommendationEngine",
>>>>>   "datasource": {
>>>>>     "params" : {
>>>>>       "name": "tiny_app_data.csv",
>>>>>       "appName": "TinyApp",
>>>>>       "eventNames": ["view"]
>>>>>     }
>>>>>   },
>>>>>   "algorithms": [
>>>>>     {
>>>>>       "comment": "simplest setup where all values are default, popularity based backfill, must add eventsNames",
>>>>>       "name": "ur",
>>>>>       "params": {
>>>>>         "appName": "TinyApp",
>>>>>         "indexName": "urindex",
>>>>>         "typeName": "items",
>>>>>         "comment": "must have data for the first event or the model will not build, other events are optional",
>>>>>         "eventNames": ["view"]
>>>>>       }
>>>>>     }
>>>>>   ]
>>>>> }
>>>>> *===========================================================*
>>>>>
>>>>> The data I'm using is:
>>>>>
>>>>> "u1","i1"
>>>>> "u2","i1"
>>>>> "u2","i2"
>>>>> "u3","i2"
>>>>> "u3","i3"
>>>>> "u4","i4"
>>>>>
>>>>> meaning user u viewed item i.
>>>>>
>>>>> The data has been added to the database with the following python code:
>>>>>
>>>>> *===========================================================*
>>>>> """
>>>>> Import sample data for recommendation engine
>>>>> """
>>>>>
>>>>> import predictionio
>>>>> import argparse
>>>>> import random
>>>>>
>>>>> RATE_ACTIONS_DELIMITER = ","
>>>>> SEED = 1
>>>>>
>>>>>
>>>>> def import_events(client, file):
>>>>>   random.seed(SEED)
>>>>>   count = 0
>>>>>   print("Importing data...")
>>>>>
>>>>>   items = []
>>>>>   users = []
>>>>>   # Open the CSV once; each line holds one "user,item" pair.
>>>>>   f = open(file, 'r')
>>>>>   for line in f:
>>>>>     data = line.rstrip('\r\n').split(RATE_ACTIONS_DELIMITER)
>>>>>     users.append(data[0])
>>>>>     items.append(data[1])
>>>>>     # One "view" event per line: user data[0] viewed item data[1].
>>>>>     client.create_event(
>>>>>       event="view",
>>>>>       entity_type="user",
>>>>>       entity_id=data[0],
>>>>>       target_entity_type="item",
>>>>>       target_entity_id=data[1]
>>>>>     )
>>>>>     print("Event: view entity_id: " + data[0] + " target_entity_id: " + data[1])
>>>>>     count += 1
>>>>>   f.close()
>>>>>
>>>>>   users = set(users)
>>>>>   items = set(items)
>>>>>   print("All users: " + str(users))
>>>>>   print("All items: " + str(items))
>>>>>   # Register every distinct item with a "$set" event.
>>>>>   for item in items:
>>>>>     client.create_event(
>>>>>       event="$set",
>>>>>       entity_type="item",
>>>>>       entity_id=item
>>>>>     )
>>>>>     count += 1
>>>>>
>>>>>
>>>>>   print("%s events are imported." % count)
>>>>>
>>>>>
>>>>> if __name__ == '__main__':
>>>>>   parser = argparse.ArgumentParser(
>>>>>     description="Import sample data for recommendation engine")
>>>>>   parser.add_argument('--access_key', default='invalid_access_key')
>>>>>   parser.add_argument('--url', default="http://localhost:7070")
>>>>>   parser.add_argument('--file', default="./data/tiny_app_data.csv")
>>>>>
>>>>>   args = parser.parse_args()
>>>>>   print(args)
>>>>>
>>>>>   client = predictionio.EventClient(
>>>>>     access_key=args.access_key,
>>>>>     url=args.url,
>>>>>     threads=5,
>>>>>     qsize=500)
>>>>>   import_events(client, args.file)
>>>>> *===========================================================*
>>>>>
>>>>> My pio_env.sh is the following:
>>>>>
>>>>> *===========================================================*
>>>>> #!/usr/bin/env bash
>>>>> #
>>>>> # Copy this file as pio-env.sh and edit it for your site's configuration.
>>>>> #
>>>>> # Licensed to the Apache Software Foundation (ASF) under one or more
>>>>> # contributor license agreements.  See the NOTICE file distributed with
>>>>> # this work for additional information regarding copyright ownership.
>>>>> # The ASF licenses this file to You under the Apache License, Version 2.0
>>>>> # (the "License"); you may not use this file except in compliance with
>>>>> # the License.  You may obtain a copy of the License at
>>>>> #
>>>>> #    http://www.apache.org/licenses/LICENSE-2.0
>>>>> #
>>>>> # Unless required by applicable law or agreed to in writing, software
>>>>> # distributed under the License is distributed on an "AS IS" BASIS,
>>>>> # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>>>>> # See the License for the specific language governing permissions and
>>>>> # limitations under the License.
>>>>> #
>>>>>
>>>>> # PredictionIO Main Configuration
>>>>> #
>>>>> # This section controls core behavior of PredictionIO. It is very likely that
>>>>> # you need to change these to fit your site.
>>>>>
>>>>> # SPARK_HOME: Apache Spark is a hard dependency and must be configured.
>>>>> # SPARK_HOME=$PIO_HOME/vendors/spark-2.0.2-bin-hadoop2.7
>>>>> SPARK_HOME=$PIO_HOME/vendors/spark-1.6.3-bin-hadoop2.6
>>>>>
>>>>> POSTGRES_JDBC_DRIVER=$PIO_HOME/lib/postgresql-42.1.4.jar
>>>>> MYSQL_JDBC_DRIVER=$PIO_HOME/lib/mysql-connector-java-5.1.41.jar
>>>>>
>>>>> # ES_CONF_DIR: You must configure this if you have advanced configuration for
>>>>> #              your Elasticsearch setup.
>>>>> # ES_CONF_DIR=/opt/elasticsearch
>>>>> #ES_CONF_DIR=$PIO_HOME/vendors/elasticsearch-1.7.6
>>>>>
>>>>> # HADOOP_CONF_DIR: You must configure this if you intend to run PredictionIO
>>>>> #                  with Hadoop 2.
>>>>> # HADOOP_CONF_DIR=/opt/hadoop
>>>>>
>>>>> # HBASE_CONF_DIR: You must configure this if you intend to run PredictionIO
>>>>> #                 with HBase on a remote cluster.
>>>>> # HBASE_CONF_DIR=$PIO_HOME/vendors/hbase-1.0.0/conf
>>>>>
>>>>> # Filesystem paths where PredictionIO uses as block storage.
>>>>> PIO_FS_BASEDIR=$HOME/.pio_store
>>>>> PIO_FS_ENGINESDIR=$PIO_FS_BASEDIR/engines
>>>>> PIO_FS_TMPDIR=$PIO_FS_BASEDIR/tmp
>>>>>
>>>>> # PredictionIO Storage Configuration
>>>>> #
>>>>> # This section controls programs that make use of PredictionIO's built-in
>>>>> # storage facilities. Default values are shown below.
>>>>> #
>>>>> # For more information on storage configuration please refer to
>>>>> # http://predictionio.incubator.apache.org/system/anotherdatastore/
>>>>>
>>>>> # Storage Repositories
>>>>>
>>>>> # Default is to use PostgreSQL
>>>>> PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
>>>>> PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH
>>>>>
>>>>> PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
>>>>> PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE
>>>>>
>>>>> PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
>>>>> PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS
>>>>>
>>>>> # Storage Data Sources
>>>>>
>>>>> # PostgreSQL Default Settings
>>>>> # Please change "pio" to your database name in PIO_STORAGE_SOURCES_PGSQL_URL
>>>>> # Please change PIO_STORAGE_SOURCES_PGSQL_USERNAME and
>>>>> # PIO_STORAGE_SOURCES_PGSQL_PASSWORD accordingly
>>>>> PIO_STORAGE_SOURCES_PGSQL_TYPE=jdbc
>>>>> PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://localhost/pio
>>>>> PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio
>>>>> PIO_STORAGE_SOURCES_PGSQL_PASSWORD=pio
>>>>>
>>>>> # MySQL Example
>>>>> # PIO_STORAGE_SOURCES_MYSQL_TYPE=jdbc
>>>>> # PIO_STORAGE_SOURCES_MYSQL_URL=jdbc:mysql://localhost/pio
>>>>> # PIO_STORAGE_SOURCES_MYSQL_USERNAME=pio
>>>>> # PIO_STORAGE_SOURCES_MYSQL_PASSWORD=pio
>>>>>
>>>>> # Elasticsearch Example
>>>>> # PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
>>>>> # PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost
>>>>> # PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9200
>>>>> # PIO_STORAGE_SOURCES_ELASTICSEARCH_SCHEMES=http
>>>>> # PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=$PIO_HOME/vendors/elasticsearch-5.2.1
>>>>> # Elasticsearch 1.x Example
>>>>> PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
>>>>> PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=myprojectES
>>>>> PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost
>>>>> PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300
>>>>>
>>>>> PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=$PIO_HOME/vendors/elasticsearch-1.7.6
>>>>>
>>>>> # Local File System Example
>>>>> PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs
>>>>> PIO_STORAGE_SOURCES_LOCALFS_PATH=$PIO_FS_BASEDIR/models
>>>>>
>>>>> # HBase Example
>>>>> PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
>>>>> PIO_STORAGE_SOURCES_HBASE_HOME=$PIO_HOME/vendors/hbase-1.2.6
>>>>>
>>>>>
>>>>> *===========================================================*
>>>>>
>>>>> Error message:
>>>>>
>>>>> *===========================================================*
>>>>> [ERROR] [TaskSetManager] Task 2.0 in stage 10.0 (TID 24) had a not
>>>>> serializable result: org.apache.mahout.math.RandomAccessSparseVector
>>>>> Serialization stack:
>>>>>     - object not serializable (class:
>>>>> org.apache.mahout.math.RandomAccessSparseVector, value: {3:1.0,2:1.0})
>>>>>     - field (class: scala.Tuple2, name: _2, type: class
>>>>> java.lang.Object)
>>>>>     - object (class scala.Tuple2, (2,{3:1.0,2:1.0})); not retrying
>>>>> [ERROR] [TaskSetManager] Task 3.0 in stage 10.0 (TID 25) had a not
>>>>> serializable result: org.apache.mahout.math.RandomAccessSparseVector
>>>>> Serialization stack:
>>>>>     - object not serializable (class:
>>>>> org.apache.mahout.math.RandomAccessSparseVector, value: {0:1.0,3:1.0})
>>>>>     - field (class: scala.Tuple2, name: _2, type: class
>>>>> java.lang.Object)
>>>>>     - object (class scala.Tuple2, (3,{0:1.0,3:1.0})); not retrying
>>>>> [ERROR] [TaskSetManager] Task 1.0 in stage 10.0 (TID 23) had a not
>>>>> serializable result: org.apache.mahout.math.RandomAccessSparseVector
>>>>> Serialization stack:
>>>>>     - object not serializable (class:
>>>>> org.apache.mahout.math.RandomAccessSparseVector, value: {1:1.0})
>>>>>     - field (class: scala.Tuple2, name: _2, type: class
>>>>> java.lang.Object)
>>>>>     - object (class scala.Tuple2, (1,{1:1.0})); not retrying
>>>>> [ERROR] [TaskSetManager] Task 0.0 in stage 10.0 (TID 22) had a not
>>>>> serializable result: org.apache.mahout.math.RandomAccessSparseVector
>>>>> Serialization stack:
>>>>>     - object not serializable (class:
>>>>> org.apache.mahout.math.RandomAccessSparseVector, value: {0:1.0})
>>>>>     - field (class: scala.Tuple2, name: _2, type: class
>>>>> java.lang.Object)
>>>>>     - object (class scala.Tuple2, (0,{0:1.0})); not retrying
>>>>> Exception in thread "main" org.apache.spark.SparkException: Job
>>>>> aborted due to stage failure: Task 2.0 in stage 10.0 (TID 24) had a not
>>>>> serializable result: org.apache.mahout.math.RandomAccessSparseVector
>>>>> Serialization stack:
>>>>>     - object not serializable (class:
>>>>> org.apache.mahout.math.RandomAccessSparseVector, value: {3:1.0,2:1.0})
>>>>>     - field (class: scala.Tuple2, name: _2, type: class
>>>>> java.lang.Object)
>>>>>     - object (class scala.Tuple2, (2,{3:1.0,2:1.0}))
>>>>>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
>>>>>     at
>>>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
>>>>>     at
>>>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
>>>>>     at
>>>>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>>>>     at
>>>>> scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>>>>     at
>>>>> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
>>>>>     at
>>>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
>>>>>     at
>>>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
>>>>>     at scala.Option.foreach(Option.scala:236)
>>>>>     at
>>>>> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
>>>>>     at
>>>>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
>>>>>     at
>>>>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
>>>>>     at
>>>>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
>>>>>     at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>>>>>     at
>>>>> org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
>>>>>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
>>>>>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1952)
>>>>>     at org.apache.spark.rdd.RDD$$anonfun$fold$1.apply(RDD.scala:1088)
>>>>>     at
>>>>> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
>>>>>     at
>>>>> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
>>>>>     at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
>>>>>     at org.apache.spark.rdd.RDD.fold(RDD.scala:1082)
>>>>>     at org.apache.mahout.sparkbindings.drm.CheckpointedDrmSpark.computeNRow(CheckpointedDrmSpark.scala:188)
>>>>>     at
>>>>> org.apache.mahout.sparkbindings.drm.CheckpointedDrmSpark.nrow$lzycompute(CheckpointedDrmSpark.scala:55)
>>>>>     at
>>>>> org.apache.mahout.sparkbindings.drm.CheckpointedDrmSpark.nrow(CheckpointedDrmSpark.scala:55)
>>>>>     at
>>>>> org.apache.mahout.sparkbindings.drm.CheckpointedDrmSpark.newRowCardinality(CheckpointedDrmSpark.scala:219)
>>>>>     at com.actionml.IndexedDatasetSpark$.apply(Preparator.scala:213)
>>>>>     at com.actionml.Preparator$$anonfun$3.apply(Preparator.scala:71)
>>>>>     at com.actionml.Preparator$$anonfun$3.apply(Preparator.scala:49)
>>>>>     at
>>>>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>>>>>     at
>>>>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>>>>>     at scala.collection.immutable.List.foreach(List.scala:318)
>>>>>     at
>>>>> scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>>>>>     at scala.collection.AbstractTraversable.map(Traversable.scala:105)
>>>>>     at com.actionml.Preparator.prepare(Preparator.scala:49)
>>>>>     at com.actionml.Preparator.prepare(Preparator.scala:32)
>>>>>     at
>>>>> org.apache.predictionio.controller.PPreparator.prepareBase(PPreparator.scala:37)
>>>>>     at
>>>>> org.apache.predictionio.controller.Engine$.train(Engine.scala:671)
>>>>>     at
>>>>> org.apache.predictionio.controller.Engine.train(Engine.scala:177)
>>>>>     at
>>>>> org.apache.predictionio.workflow.CoreWorkflow$.runTrain(CoreWorkflow.scala:67)
>>>>>     at
>>>>> org.apache.predictionio.workflow.CreateWorkflow$.main(CreateWorkflow.scala:250)
>>>>>     at
>>>>> org.apache.predictionio.workflow.CreateWorkflow.main(CreateWorkflow.scala)
>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>     at
>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>>>     at
>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>     at java.lang.reflect.Method.invoke(Method.java:498)
>>>>>     at
>>>>> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
>>>>>     at
>>>>> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
>>>>>     at
>>>>> org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
>>>>>     at
>>>>> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
>>>>>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>>>>
>>>>> *===========================================================*
>>>>> Thank you all for your help.
>>>>>
>>>>> Best regards,
>>>>> noelia
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> <http://www.vicomtech.org/>
>>>>
>>>> Noelia Osés Fernández, PhD
>>>> Senior Researcher |
>>>> Investigadora Senior
>>>>
>>>> noses@vicomtech.org
>>>> +[34] 943 30 92 30
>>>> Data Intelligence for Energy and
>>>> Industrial Processes | Inteligencia
>>>> de Datos para Energía y Procesos
>>>> Industriales
>>>>
>>>> <https://www.linkedin.com/company/vicomtech>
>>>> <https://www.youtube.com/user/VICOMTech>
>>>> <https://twitter.com/@Vicomtech_IK4>
>>>>
>>>> member of:  <http://www.graphicsmedia.net/>     <http://www.ik4.es/>
>>>>
>>>> Legal Notice - Privacy policy
>>>> <http://www.vicomtech.org/en/proteccion-datos>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>
>
