predictionio-user mailing list archives

From Mike Graham <m...@teyanlogic.se>
Subject Re: pio Train error
Date Fri, 02 Dec 2016 18:11:05 GMT
And this is what I have:

root@server1 [/MyRecommendation]# jps -l
36416 org.apache.predictionio.tools.console.Console
32592 /home/aml/apache-predictionio-0.10.0-incubating/PredictionIO-0.10.0-incubating/sbt/sbt-launch-0.13.7.jar
18641 org.apache.predictionio.tools.console.Console
38353 sun.tools.jps.Jps
32485 org.apache.predictionio.tools.console.Console
18262 org.apache.hadoop.hbase.master.HMaster
18168 org.elasticsearch.bootstrap.Elasticsearch
35082 org.apache.predictionio.tools.console.Console

Could someone give me some idea as to how to configure this? I cannot find anything in the documentation that I haven’t already done.
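
Looking at the trace below, the failure bottoms out in an UnsatisfiedLinkError while loading libsnappyjava.so from /tmp ("failed to map segment from shared object: Operation not permitted"), which is the usual symptom of /tmp being mounted noexec. A minimal sketch of two possible workarounds, assuming that mount flag really is the cause here (the /opt/pio-tmp directory is illustrative):

  # Check whether /tmp is mounted with the noexec flag
  mount | grep ' /tmp '

  # Option 1: remount /tmp with exec enabled (lasts until the next reboot)
  mount -o remount,exec /tmp

  # Option 2: have snappy-java extract its native library into a directory
  # that allows execution, passing the JVM option through to spark-submit
  mkdir -p /opt/pio-tmp
  pio train -- --driver-java-options "-Dorg.xerial.snappy.tempdir=/opt/pio-tmp"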



> On 2 Dec 2016, at 17:51, Mike Graham <mike@teyanlogic.se> wrote:
> 
> I am following this quick start guide to the letter but I get the same error every time. The build succeeds but pio train fails.
> 
> I have one event.
> 
> Any help is much appreciated.
> 
> 
> root@server1 [/MyRecommendation]# curl -i -X GET "http://localhost:7070/events.json?accessKey=Y4Wc0_GqS1q6PU5prUnj62g_3rWBkM2a7VyjZ3BgGLYj3hvvsV99lSvRjL9gpU3w"
> HTTP/1.1 200 OK
> Server: spray-can/1.3.3
> Date: Fri, 02 Dec 2016 16:45:38 GMT
> Content-Type: application/json; charset=UTF-8
> Content-Length: 259
> 
> [{"eventId":"944c921fa02046589d6cc0dbfd287b5e","event":"rate","entityType":"user","entityId":"u0","targetEntityType":"item","targetEntityId":"i0","properties":{"rating":5},"eventTime":"2014-11-02T09:39:45.618-08:00","creationTime":"2016-12-02T16:43:48.555Z"}]root@server1
[/MyRecommendation]# vi engine.json
> root@server1 [/MyRecommendation]# pio build
> 
> 
> 
> [INFO] [Console$] Using command '/home/aml/apache-predictionio-0.10.0-incubating/PredictionIO-0.10.0-incubating/sbt/sbt' at the current working directory to build.
> [INFO] [Console$] If the path above is incorrect, this process will fail.
> [INFO] [Console$] Uber JAR disabled. Making sure lib/pio-assembly-0.10.0-incubating.jar is absent.
> [INFO] [Console$] Going to run: /home/aml/apache-predictionio-0.10.0-incubating/PredictionIO-0.10.0-incubating/sbt/sbt package assemblyPackageDependency
> [INFO] [Console$] Build finished successfully.
> [INFO] [Console$] Looking for an engine...
> [INFO] [Console$] Found template-scala-parallel-recommendation_2.10-0.1-SNAPSHOT.jar
> [INFO] [Console$] Found template-scala-parallel-recommendation-assembly-0.1-SNAPSHOT-deps.jar
> [INFO] [RegisterEngine$] Registering engine paKnEN71bFYT99z9l76bMmCYcNRz61q2 6a2841579d2c44559cafedbf97d19cb57b37eec2
> [INFO] [Console$] Your engine is ready for training.
> 
> 
> root@server1 [/MyRecommendation]# pio train
> [INFO] [Console$] Using existing engine manifest JSON at /MyRecommendation/manifest.json
[INFO] [Runner$] Submission command: /home/aml/apache-predictionio-0.10.0-incubating/PredictionIO-0.10.0-incubating/vendors/spark-1.5.1-bin-hadoop2.6/bin/spark-submit --class org.apache.predictionio.workflow.CreateWorkflow --jars file:/MyRecommendation/target/scala-2.10/template-scala-parallel-recommendation_2.10-0.1-SNAPSHOT.jar,file:/MyRecommendation/target/scala-2.10/template-scala-parallel-recommendation-assembly-0.1-SNAPSHOT-deps.jar --files file:/home/aml/apache-predictionio-0.10.0-incubating/PredictionIO-0.10.0-incubating/conf/log4j.properties --driver-class-path /home/aml/apache-predictionio-0.10.0-incubating/PredictionIO-0.10.0-incubating/conf:/home/aml/apache-predictionio-0.10.0-incubating/PredictionIO-0.10.0-incubating/lib/postgresql-9.4-1204.jdbc41.jar:/home/aml/apache-predictionio-0.10.0-incubating/PredictionIO-0.10.0-incubating/lib/mysql-connector-java-5.1.37.jar file:/home/aml/apache-predictionio-0.10.0-incubating/PredictionIO-0.10.0-incubating/lib/pio-assembly-0.10.0-incubating.jar --engine-id paKnEN71bFYT99z9l76bMmCYcNRz61q2 --engine-version 6a2841579d2c44559cafedbf97d19cb57b37eec2 --engine-variant file:/MyRecommendation/engine.json --verbosity 0 --json-extractor Both --env PIO_ENV_LOADED=1,PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta,PIO_FS_BASEDIR=/root/.pio_store,PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost,PIO_STORAGE_SOURCES_HBASE_HOME=/home/aml/apache-predictionio-0.10.0-incubating/PredictionIO-0.10.0-incubating/vendors/hbase-1.0.0,PIO_HOME=/home/aml/apache-predictionio-0.10.0-incubating/PredictionIO-0.10.0-incubating,PIO_FS_ENGINESDIR=/root/.pio_store/engines,PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://localhost/pio,PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch,PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=PGSQL,PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=PGSQL,PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event,PIO_STORAGE_SOURCES_PGSQL_PASSWORD=pio,PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=/home/aml/apache-predictionio-0.10.0-incubating/PredictionIO-0.10.0-incubating/vendors/elasticsearch-1.4.4,PIO_STORAGE_SOURCES_PGSQL_TYPE=jdbc,PIO_FS_TMPDIR=/root/.pio_store/tmp,PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio,PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model,PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=PGSQL,PIO_CONF_DIR=/home/aml/apache-predictionio-0.10.0-incubating/PredictionIO-0.10.0-incubating/conf,PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300
> [INFO] [Engine] Extracting datasource params...
> [INFO] [WorkflowUtils$] No 'name' is found. Default empty String will be used.
> [INFO] [Engine] Datasource params: (,DataSourceParams(MyApp1,None))
> [INFO] [Engine] Extracting preparator params...
> [INFO] [Engine] Preparator params: (,Empty)
> [INFO] [Engine] Extracting serving params...
> [INFO] [Engine] Serving params: (,Empty)
> [INFO] [Remoting] Starting remoting
> [INFO] [Remoting] Remoting started; listening on addresses :[akka.tcp://sparkDriver@82.102.5.151:41379]
> [WARN] [MetricsSystem] Using default name DAGScheduler for source because spark.app.id is not set.
> [INFO] [Engine$] EngineWorkflow.train
> [INFO] [Engine$] DataSource: com.iqchef.DataSource@5dfe23e8
> [INFO] [Engine$] Preparator: com.iqchef.Preparator@1989e8c6
> [INFO] [Engine$] AlgorithmList: List(com.iqchef.ALSAlgorithm@67d32a54)
> [INFO] [Engine$] Data sanity check is on.
> [INFO] [Engine$] com.iqchef.TrainingData does not support data sanity check. Skipping check.
> [INFO] [Engine$] com.iqchef.PreparedData does not support data sanity check. Skipping check.
> [WARN] [QueuedThreadPool] 3 threads could not be stopped
> Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task serialization failed: java.lang.reflect.InvocationTargetException
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:67)
> org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:60)
> org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$setConf(TorrentBroadcast.scala:73)
> org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:80)
> org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
> org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:63)
> org.apache.spark.SparkContext.broadcast(SparkContext.scala:1327)
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:861)
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:772)
> org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:757)
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1466)
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
> org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> 
> 	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
> 	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1271)
> 	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1270)
> 	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> 	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> 	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1270)
> 	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:871)
> 	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:772)
> 	at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:757)
> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1466)
> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
> 	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> 	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
> 	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1822)
> 	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1835)
> 	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1848)
> 	at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1298)
> 	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
> 	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
> 	at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
> 	at org.apache.spark.rdd.RDD.take(RDD.scala:1272)
> 	at com.iqchef.ALSAlgorithm.train(ALSAlgorithm.scala:35)
> 	at com.iqchef.ALSAlgorithm.train(ALSAlgorithm.scala:22)
> 	at org.apache.predictionio.controller.PAlgorithm.trainBase(PAlgorithm.scala:50)
> 	at org.apache.predictionio.controller.Engine$$anonfun$18.apply(Engine.scala:692)
> 	at org.apache.predictionio.controller.Engine$$anonfun$18.apply(Engine.scala:692)
> 	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> 	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> 	at scala.collection.immutable.List.foreach(List.scala:318)
> 	at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
> 	at scala.collection.AbstractTraversable.map(Traversable.scala:105)
> 	at org.apache.predictionio.controller.Engine$.train(Engine.scala:692)
> 	at org.apache.predictionio.controller.Engine.train(Engine.scala:177)
> 	at org.apache.predictionio.workflow.CoreWorkflow$.runTrain(CoreWorkflow.scala:67)
> 	at org.apache.predictionio.workflow.CreateWorkflow$.main(CreateWorkflow.scala:250)
> 	at org.apache.predictionio.workflow.CreateWorkflow.main(CreateWorkflow.scala)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
> 	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
> 	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
> 	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
> 	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lang.reflect.InvocationTargetException
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> 	at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:67)
> 	at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:60)
> 	at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$setConf(TorrentBroadcast.scala:73)
> 	at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:80)
> 	at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
> 	at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:63)
> 	at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1327)
> 	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:861)
> 	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:772)
> 	at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:757)
> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1466)
> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
> 	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> Caused by: java.lang.IllegalArgumentException: java.lang.UnsatisfiedLinkError: /tmp/snappy-unknown-16ed97d8-95ed-4f53-b497-fafa14223e36-libsnappyjava.so: /tmp/snappy-unknown-16ed97d8-95ed-4f53-b497-fafa14223e36-libsnappyjava.so: failed to map segment from shared object: Operation not permitted
> 	at org.apache.spark.io.SnappyCompressionCodec.<init>(CompressionCodec.scala:151)
> 	... 18 more
> Caused by: java.lang.UnsatisfiedLinkError: /tmp/snappy-unknown-16ed97d8-95ed-4f53-b497-fafa14223e36-libsnappyjava.so: /tmp/snappy-unknown-16ed97d8-95ed-4f53-b497-fafa14223e36-libsnappyjava.so: failed to map segment from shared object: Operation not permitted
> 	at java.lang.ClassLoader$NativeLibrary.load(Native Method)
> 	at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
> 	at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
> 	at java.lang.Runtime.load0(Runtime.java:809)
> 	at java.lang.System.load(System.java:1086)
> 	at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:166)
> 	at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:145)
> 	at org.xerial.snappy.Snappy.<clinit>(Snappy.java:47)
> 	at org.apache.spark.io.SnappyCompressionCodec.<init>(CompressionCodec.scala:149)
> 	... 18 more
> root@server1 [/MyRecommendation]# 
> 
> 
> 
>> On 2 Dec 2016, at 13:08, Mike Graham <mike@teyanlogic.se> wrote:
>> 
>> Hi, thanks for the reply.
>> 
>> I am following the instructions exactly as explained in http://predictionio.incubator.apache.org/templates/recommendation/quickstart/
>> 
>> 
>> My engine.json looks like this:
>> 
>> {
>>   "id": "default",
>>   "description": "Default settings",
>>   "engineFactory": "com.iqchef.RecommendationEngine",
>>   "datasource": {
>>     "params" : {
>>       "appName": "MyTestApp"
>>     }
>>   },
>>   "algorithms": [
>>     {
>>       "name": "als",
>>       "params": {
>>         "rank": 10,
>>         "numIterations": 20,
>>         "lambda": 0.01,
>>         "seed": 3
>>       }
>>     }
>>   ]
>> }
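>> 
>> One thing that may be worth double-checking: the appName under datasource.params has to match an application that is already registered with the event server. The registered names and their access keys can be listed with the pio CLI:
>> 
>>   # List registered apps; the appName in engine.json must match one of these exactly
>>   pio app list
>> 
>> If the name there differs from the one in engine.json, training will not see the intended events.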
>> 
>> 
>> 
>> Is this correct?
>> 
>> 
>> 
>> 
>>> On 2 Dec 2016, at 13:02, Natu Lauchande <nlauchande@gmail.com> wrote:
>>> 
>>> Hi,
>>> 
>>> I had similar errors when the event data was not fully there as expected during the Spark operations. Double-check that your training data format is compatible with what's described in the DataSource.
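>>> 
>>> For reference, a single event in the format the quickstart's DataSource expects (a rate event with a numeric rating property) can be posted like this; the access key and entity IDs below are placeholders:
>>> 
>>>   curl -i -X POST "http://localhost:7070/events.json?accessKey=YOUR_ACCESS_KEY" \
>>>     -H "Content-Type: application/json" \
>>>     -d '{
>>>       "event" : "rate",
>>>       "entityType" : "user",
>>>       "entityId" : "u0",
>>>       "targetEntityType" : "item",
>>>       "targetEntityId" : "i0",
>>>       "properties" : { "rating" : 5 }
>>>     }'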
>>> 
>>> Thanks,
>>> Natu
>> 
> 

