predictionio-user mailing list archives

From Pat Ferrel <...@occamsmachete.com>
Subject Re: pio train error java.lang.NegativeArraySizeException
Date Wed, 29 Mar 2017 18:26:15 GMT
yes

There is very little validation of events done by PredictionIO, since events are Template-specific
and the EventServer is not. Any usage event that does not have "entityType": "user" is
ignored, so as far as the UR is concerned you have no data.
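
One way to check for this before training is to tally the entityType values in the output of "pio export". The snippet below is only a rough sketch, not part of PredictionIO or the UR: the function name is made up for illustration, and it assumes the export has one JSON event per line and that the usage event names match the eventNames in engine.json.

```python
import json
from collections import Counter


def summarize_entity_types(json_lines, event_names):
    """Count entityType values among the usage events named in engine.json.

    Hypothetical helper: assumes each non-empty line is one JSON event.
    """
    counts = Counter()
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("event") in event_names:
            counts[event.get("entityType")] += 1
    return counts


# Two trimmed-down events: one shaped like the sample in this thread,
# one using the "user" entityType that the UR expects.
sample = [
    '{"event": "buy", "entityType": "customer", "entityId": "3927685", '
    '"targetEntityType": "item", "targetEntityId": "1"}',
    '{"event": "buy", "entityType": "user", "entityId": "42", '
    '"targetEntityType": "item", "targetEntityId": "1"}',
]

counts = summarize_entity_types(sample, {"buy"})
usable = counts.get("user", 0)  # only these events would reach the UR
print(dict(counts), "usable:", usable)
```

If the "user" count comes back as zero, every event is being ignored and training has nothing to build a model from.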

Also, user properties encoded in this way are ignored. User data should be encoded as some
kind of preference indicator and sent as a named usage event. Location can be used as a preference
indicator, but I'd choose a single granularity, such as postal code. It also needs special
setup, since the downsampling thresholds assume you will have many possible values for any
indicator. This can be configured, but you have to ask yourself: "do I really think user
location is going to be important?"
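
As an illustration of what "sent as a named usage event" could look like, here is a rough sketch that turns a property on an existing event into separate indicator events. The event name "zip-pref", the use of the property name as targetEntityType, the function name, and the "00000" postal code are all placeholders chosen for this example and do not come from the UR docs. Presumably the new event name would also need to be listed in eventNames after the primary event.

```python
def property_to_indicator_events(event, prop="zip", indicator_name="zip-pref"):
    """Build one usage event per value of a user property.

    Hypothetical helper: the indicator name and targetEntityType are
    illustrative choices, not names required by the UR.
    """
    values = event.get("properties", {}).get(prop, [])
    if not isinstance(values, list):
        values = [values]
    return [
        {
            "event": indicator_name,
            "entityType": "user",  # anything else is ignored by the UR
            "entityId": event["entityId"],
            "targetEntityType": prop,
            "targetEntityId": str(value),
            "eventTime": event.get("eventTime"),
        }
        for value in values
    ]


# Shaped like the exported sample in this thread; "00000" is a placeholder.
buy_event = {
    "event": "buy",
    "entityType": "customer",
    "entityId": "3927685",
    "targetEntityType": "item",
    "targetEntityId": "1",
    "properties": {"zip": ["00000"], "country": ["US"]},
    "eventTime": "2016-10-07T00:16:12.000Z",
}

indicator_events = property_to_indicator_events(buy_event)
print(indicator_events)
```

The original "buy" event would still need to be re-sent with "entityType": "user"; this only covers the location data that was riding along in its properties.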


On Mar 28, 2017, at 11:05 AM, Haddix, Steven <Steven.Haddix@wendys.com> wrote:

SOLVED: It appears the Universal Recommendation Engine requires each event's entityType to equal
“user”.

From: Microsoft Office User <Steven.Haddix@wendys.com>
Reply-To: <user@predictionio.incubator.apache.org>
Date: Tue, 28 Mar 2017 16:38:45 +0000
To: "user@predictionio.incubator.apache.org" <user@predictionio.incubator.apache.org>
Subject: pio train error java.lang.NegativeArraySizeException

I know this has been posted a few times, but I can't seem to get around this error. I’m
not sure what I’m missing but any help is appreciated.

I’m using the Universal Recommendation Engine.

It appears I have data loaded with an event type that matches the engine.json.

When training my data I get the following error:

Error

[WARN] [TaskSetManager] Lost task 0.0 in stage 13.0 (TID 9, localhost): java.lang.NegativeArraySizeException
        at org.apache.mahout.math.DenseVector.<init>(DenseVector.java:57)
        at org.apache.mahout.sparkbindings.SparkEngine$$anonfun$5.apply(SparkEngine.scala:78)
        at org.apache.mahout.sparkbindings.SparkEngine$$anonfun$5.apply(SparkEngine.scala:77)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

My engine.json is:
 {
  "comment":" This config file uses default settings for all but the required values see README.md for docs",
  "id": "default",
  "description": "Default settings",
  "engineFactory": "com.wendys.RecommendationEngine",
  "datasource": {
    "params" : {
      "name": "sample-handmade-data.txt",
      "appName": "Ordering",
      "eventNames": ["buy"]
    }
  },
  "sparkConf": {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
    "spark.kryo.referenceTracking": "false",
    "spark.kryoserializer.buffer": "300m",
    "spark.executor.memory": "4g",
    "es.index.auto.create": "true"
  },
  "algorithms": [
    {
      "comment": "simplest setup where all values are default, popularity based backfill, must add eventsNames",
      "name": "ur",
      "params": {
        "appName": "Ordering",
        "indexName": "urindex",
        "typeName": "items",
        "comment": "must have data for the first event or the model will not build, other events are optional",
        "eventNames": ["buy"]
      }
    }
  ]
}

Sample event from "pio export --appid 3 --format json"
{
  "eventId": "__RjfodLTziIR9c6tFXyCwAAAVecf1DgqYK6F_BBYGQ",
  "event": "buy",
  "entityType": "customer",
  "entityId": "3927685",
  "targetEntityType": "item",
  "targetEntityId": "1",
  "properties": {
    "city": [
      "<removed>"
    ],
    "state": [
      "<removed>"
    ],
    "zip": [
      <removed>
    ],
    "country": [
      "US"
    ]
  },
  "eventTime": "2016-10-07T00:16:12.000Z",
  "creationTime": "2017-03-28T14:20:14.177Z"
}

Pio status result

[INFO] [Console$] Inspecting PredictionIO… 
[INFO] [Console$] PredictionIO 0.9.6 is installed at /PredictionIO  
[INFO] [Console$] Inspecting Apache Spark…  
[INFO] [Console$] Apache Spark is installed at /PredictionIO/vendors/spark-1.6.2-bin-hadoop2.6
[INFO] [Console$] Apache Spark 1.6.2 detected (meets minimum requirement of 1.3.0) 
[INFO] [Console$] Inspecting storage backend connections…  
[INFO] [Storage$] Verifying Meta Data Backend (Source: ELASTICSEARCH)… 
[INFO] [Storage$] Verifying Model Data Backend (Source: LOCALFS)…  
[INFO] [Storage$] Verifying Event Data Backend (Source: HBASE)…  
[INFO] [Storage$] Test writing to Event Store (App Id 0)…  
[INFO] [HBLEvents] The table pio_event:events_0 doesnt exist yet. Creating now…  
[INFO] [HBLEvents] Removing table pio_event:events_0…  
[INFO] [Console$] (sleeping 5 seconds for all messages to show up…)  
[INFO] [Console$] Your system is all ready to go.


Notice: This e-mail message and its attachments are the property of The Wendy's Company or
one of its subsidiaries and may contain confidential or legally privileged information intended
solely for the use of the addressee(s). If you are not an intended recipient, then any use,
copying or distribution of this message or its attachments is strictly prohibited. If you
received this message in error, please notify the sender and delete this message entirely
from your system.