predictionio-user mailing list archives

From Pat Ferrel <...@occamsmachete.com>
Subject Re: Error while pio status
Date Mon, 03 Apr 2017 15:11:08 GMT
The data will come from HBase (or possibly JDBC, but that is not recommended); the model is always
stored in Elasticsearch. The reason for storing it in Elasticsearch is that the last step of the
algorithm is performed by the ES query, which gives k-nearest neighbors based on cosine similarity.
This is not possible with HDFS. We are not fetching things by ID; we are performing a mathematical
operation on the model that determines which things are fetched.
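
To make the difference from an ID lookup concrete, here is a toy sketch of what "k-nearest neighbors based on cosine similarity" means. It is not how Elasticsearch or the UR implements the query; the item ids, the vectors, and the `knn` function name are all made up for illustration.

```shell
# Toy data (made up): an item id followed by a three-dimensional vector.
items='a 1 0 0
b 0.9 0.1 0
c 0 1 0
d 0 0 1'

# knn QUERY K: rank every item by cosine similarity to QUERY and print the
# ids of the top K. Assumes no vector is all zeros (that would divide by zero).
knn() {
  printf '%s\n' "$items" | awk -v q="$1" '
    BEGIN { n = split(q, qv, " "); for (i = 1; i <= n; i++) qn += qv[i] * qv[i] }
    {
      dot = 0; norm = 0
      for (i = 1; i <= n; i++) { dot += $(i + 1) * qv[i]; norm += $(i + 1) * $(i + 1) }
      printf "%.6f %s\n", dot / (sqrt(norm) * sqrt(qn)), $1
    }' | sort -rn | head -n "$2" | awk '{ print $2 }'
}

knn "1 0 0" 2
```

The result depends on scoring every stored item against the query, which is why a store that only returns blobs by key cannot do this step.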

HDFS may be used for import/export but is not needed by the UR explicitly.

If you are using the setup instructions on actionml.com I suggest you look through them again.
It looks like you have tried things that were outside of those instructions.


#!/usr/bin/env bash

# PredictionIO Main Configuration
#
# This section controls core behavior of PredictionIO. It is very likely that
# you need to change these to fit your site.

# Safe config that will work if you expand your cluster later
SPARK_HOME=/usr/local/spark
ES_CONF_DIR=/usr/local/elasticsearch
HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
HBASE_CONF_DIR=/usr/local/hbase/conf

# Filesystem paths that PredictionIO uses as block storage.
PIO_FS_BASEDIR=$HOME/.pio_store
PIO_FS_ENGINESDIR=$PIO_FS_BASEDIR/engines
PIO_FS_TMPDIR=$PIO_FS_BASEDIR/tmp

# PredictionIO Storage Configuration
#
# This section controls programs that make use of PredictionIO's built-in
# storage facilities.

# Storage Repositories

PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH


PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE

# Need to use HDFS here instead of LOCALFS to account for future expansion
PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
# PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=HDFS
PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=ELASTICSEARCH


# Storage Data Sources, lower level than repos above, just a simple storage API
# to use

# Elasticsearch Example
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=/usr/local/elasticsearch
# the next line should match the cluster.name in elasticsearch.yml
PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=infoquest

# For single host Elasticsearch, may add hosts and ports later
# PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=some-master
# Put your DNS name or IP address for ES here
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=some-master
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300

# dummy models are stored here so use HDFS in case you later want to
# expand the Event and PredictionServers
PIO_STORAGE_SOURCES_HDFS_TYPE=hdfs
PIO_STORAGE_SOURCES_HDFS_PATH=hdfs://some-master:9000/models

# HBase Source config
PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
PIO_STORAGE_SOURCES_HBASE_HOME=/usr/local/hbase
# Hbase single master config
# PIO_STORAGE_SOURCES_HBASE_HOSTS=some-master
# Put your DNS name or IP address for HBase here
PIO_STORAGE_SOURCES_HBASE_HOSTS=some-master
PIO_STORAGE_SOURCES_HBASE_PORTS=0

# I don’t think this is used
PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs
# Really? /mymodels at the root of the local disk?
PIO_STORAGE_SOURCES_FS_PATH=/mymodels
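
Two mistakes are easy to make in a file like this: leaving whitespace after an `=`, and leaving the `some-master` placeholder in place. A small check can catch both before running pio status. This is only a sketch; `check_pio_env` is a hypothetical helper, not part of PredictionIO.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of PredictionIO: flags two easy-to-miss
# problems in a pio-env.sh style file. Returns 0 if none are found, 1 otherwise.
check_pio_env() {
  pio_env_file="$1"
  pio_env_problems=0
  # Bash reads "NAME= value" as an empty assignment followed by a command
  # called "value", so the variable never receives the intended setting.
  if grep -nE '^[A-Za-z_][A-Za-z0-9_]*=[[:space:]]+[^[:space:]]' "$pio_env_file"; then
    echo "whitespace after '=' on the line(s) above"
    pio_env_problems=1
  fi
  # The placeholder host name from the setup instructions must be replaced.
  # Commented-out lines are ignored.
  if grep -nE '^[^#]*some-master' "$pio_env_file"; then
    echo "placeholder host 'some-master' still present on the line(s) above"
    pio_env_problems=1
  fi
  return "$pio_env_problems"
}
```

It would be called as `check_pio_env conf/pio-env.sh`, assuming the file lives in the `conf` directory of your PredictionIO install.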




On Apr 3, 2017, at 7:01 AM, infoquest india <infoquestindia@gmail.com> wrote:

Can we use HDFS or LocalFileSystem for the UR?

I am using a single-machine setup and changed my /etc/hosts file to point to the internal IP.

Please find attached pio-env.sh.

One thing I am not clear on: what is causing the issue, HDFS or Elasticsearch?


Thanks
Gaurav
http://www.infoquestsolutions.com
Turning Imagination To Reality
Skype:- infoquestsolutions
Gtalk:- infoquestindia

On Mon, Apr 3, 2017 at 6:52 PM, Pat Ferrel <pat@occamsmachete.com> wrote:
If you are still using the UR you don’t need HDFS as a storage backend.

In the setup instructions, “some-master” is a placeholder; replace it with the DNS name or IP
address of the master machine running Elasticsearch. This can be a comma-separated list with
no spaces.
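
As an illustration, a multi-host entry might look like the sketch below. The host names `es-node-1` through `es-node-3` are hypothetical; substitute your own DNS names or IP addresses.

```shell
# Hypothetical host names; substitute real DNS names or IP addresses.
# Commas only, with no spaces anywhere in the value.
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=es-node-1,es-node-2,es-node-3

# Print one host per line to confirm the list splits as intended.
es_hosts=$(printf '%s\n' "$PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS" | tr ',' '\n')
printf '%s\n' "$es_hosts"
```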

Can you share your pio-env.sh?


On Apr 3, 2017, at 4:31 AM, infoquest india <infoquestindia@gmail.com> wrote:

Hi,

When I run pio status I get the following error:


SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/home/aml/pio/PredictionIO-0.11.0-SNAPSHOT/lib/spark/pio-data-hdfs-assembly-0.11.0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/home/aml/pio/PredictionIO-0.11.0-SNAPSHOT/lib/pio-assembly-0.11.0-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

[INFO] [Management$] Inspecting PredictionIO...

[INFO] [Management$] PredictionIO 0.11.0-SNAPSHOT is installed at /home/aml/pio/PredictionIO-0.11.0-SNAPSHOT

[INFO] [Management$] Inspecting Apache Spark...

[INFO] [Management$] Apache Spark is installed at None

[INFO] [Management$] Apache Spark 1.6.3 detected (meets minimum requirement of 1.3.0)

[INFO] [Management$] Inspecting storage backend connections...

[INFO] [Storage$] Verifying Meta Data Backend (Source: ELASTICSEARCH)...

[INFO] [Storage$] Verifying Model Data Backend (Source: HDFS)...

[ERROR] [Storage$] Error initializing storage client for source HDFS

[ERROR] [Management$] Unable to connect to all storage backends successfully.

The following shows the error message from the storage backend.



Data source HDFS was not properly initialized. (org.apache.predictionio.data.storage.StorageClientException)



Dumping configuration of initialized storage backend sources.

Please make sure they are correct.



Source Name: ELASTICSEARCH; Type: elasticsearch; Configuration: HOME -> /usr/local/elasticsearch, HOSTS -> some-master, PORTS -> 9300, CLUSTERNAME -> infoquest, TYPE -> elasticsearch

Source Name: HDFS; Type: (error); Configuration: (error)



Thanks
Gaurav



<pio-env.sh.rtf>

