predictionio-user mailing list archives

From Florian Krause <florian.kra...@rebelle.com>
Subject Change of handling of env variables in 0.11?
Date Mon, 22 May 2017 08:53:05 GMT
Hi all

I have been unsuccessful at building my two engines with 0.11. I have described my attempts
here -> https://stackoverflow.com/questions/43941915/predictionio-0-11-building-an-engine-fails-with-java-lang-classnotfoundexceptio

It appears that during the pio build phase, the env vars from pio-env.sh are not set correctly. 

I have managed to work around this by not running the tests; the compiled versions of the engines
work flawlessly, so the database connection itself is fine.

Now what confuses me a bit is the usage of the --env command line param in the CreateWorkflow
jar.

This is the command pio sends to Spark:

/opt/PredictionIO-0.11.0-incubating/vendors/spark-2.1.1-bin-hadoop2.7/bin/spark-submit --driver-memory
80G --executor-memory 80G --class org.apache.predictionio.workflow.CreateWorkflow --jars file:/opt/PredictionIO-0.11.0-incubating/lib/postgresql-42.1.1.jar,file:/opt/PredictionIO-0.11.0-incubating/lib/mysql-connector-java-5.1.40-bin.jar,file:/opt/reco-engine/MatrixProduct2/target/scala-2.11/matrixproduct2_2.11-0.1-SNAPSHOT.jar,file:/opt/reco-engine/MatrixProduct2/target/scala-2.11/MatrixProduct2-assembly-0.1-SNAPSHOT-deps.jar,file:/opt/PredictionIO-0.11.0-incubating/lib/spark/pio-data-localfs-assembly-0.11.0-incubating.jar,file:/opt/PredictionIO-0.11.0-incubating/lib/spark/pio-data-hdfs-assembly-0.11.0-incubating.jar,file:/opt/PredictionIO-0.11.0-incubating/lib/spark/pio-data-jdbc-assembly-0.11.0-incubating.jar,file:/opt/PredictionIO-0.11.0-incubating/lib/spark/pio-data-elasticsearch-assembly-0.11.0-incubating.jar,file:/opt/PredictionIO-0.11.0-incubating/lib/spark/pio-data-hbase-assembly-0.11.0-incubating.jar
--files file:/opt/PredictionIO-0.11.0-incubating/conf/log4j.properties --driver-class-path
/opt/PredictionIO-0.11.0-incubating/conf:/opt/PredictionIO-0.11.0-incubating/lib/postgresql-42.1.1.jar:/opt/PredictionIO-0.11.0-incubating/lib/mysql-connector-java-5.1.40-bin.jar
--driver-java-options -Dpio.log.dir=/home/pio file:/opt/PredictionIO-0.11.0-incubating/lib/pio-assembly-0.11.0-incubating.jar
--engine-id com.rebelle.MatrixProduct2.ECommerceRecommendationEngine --engine-version 23bea44eff1a8e08bc80e290e52dc9dc565d9bb7
--engine-variant file:/opt/reco-engine/MatrixProduct2/engine.json --verbosity 0 --json-extractor
Both --env PIO_ENV_LOADED=1,PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta,PIO_HOME=/opt/PredictionIO-0.11.0-incubating,PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://localhost/pio,PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=PGSQL,PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=PGSQL,PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event,PIO_STORAGE_SOURCES_PGSQL_PASSWORD=<password>,PIO_STORAGE_SOURCES_PGSQL_TYPE=jdbc,PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio,PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model,PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=PGSQL,PIO_CONF_DIR=/opt/PredictionIO-0.11.0-incubating/conf


When I try to run this manually from the command line, I get

[ERROR] [Storage$] Error initializing storage client for source
Exception in thread "main" org.apache.predictionio.data.storage.StorageClientException: Data
source  was not properly initialized.
        at org.apache.predictionio.data.storage.Storage$$anonfun$10.apply(Storage.scala:285)
        at org.apache.predictionio.data.storage.Storage$$anonfun$10.apply(Storage.scala:285)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.predictionio.data.storage.Storage$.getDataObject(Storage.scala:284)


So even though all needed params are set in --env, Spark cannot find them. I have to set them
manually via export to make this work. What exactly is supposed to happen when these vars are
set through --env?
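For reference, this is roughly what I end up exporting by hand before running spark-submit
myself (values copied from the --env string above; the password is elided just as in the
original command, and PGSQL is simply the source name I use in pio-env.sh):

```shell
# Manual workaround: export the same variables that pio would otherwise
# pass to CreateWorkflow via --env, then run the spark-submit command.
export PIO_ENV_LOADED=1
export PIO_HOME=/opt/PredictionIO-0.11.0-incubating
export PIO_CONF_DIR=/opt/PredictionIO-0.11.0-incubating/conf
export PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=PGSQL
export PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
export PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=PGSQL
export PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
export PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=PGSQL
export PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
export PIO_STORAGE_SOURCES_PGSQL_TYPE=jdbc
export PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://localhost/pio
export PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio
# PIO_STORAGE_SOURCES_PGSQL_PASSWORD elided, as above
```

With these in the driver's environment the storage client initializes; only when they are
passed solely through --env does it fail.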

Perhaps someone can give me some pointers on what might be worth trying.

Bests & thanks

Florian