predictionio-user mailing list archives

From "Miller, Clifford" <clifford.mil...@phoenix-opsgroup.com>
Subject Re: PredictionIO with remote Spark and Elasticsearch
Date Thu, 02 Mar 2017 21:10:53 GMT
I found some old reports from people who hit the same issue.  They
indicated that the AWS Elasticsearch Service only supports HTTP and not
TCP.  If this is true, then AWS Elasticsearch has very limited
usefulness.  Has anyone else run into this?
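
For context, here is a rough sketch of the Elasticsearch block in
conf/pio-env.sh that Donald pointed to.  The variable names are taken from
the pio-env.sh.template shipped with PredictionIO 0.10 as I understand it;
the cluster name and host below are placeholders, not real values.  Port
9300 is Elasticsearch's native transport (TCP) port, which would explain why
an endpoint that only speaks HTTP cannot be used this way.

```shell
# Sketch of the Elasticsearch storage source in conf/pio-env.sh.
# Variable names assumed from the PredictionIO 0.10 template; values are placeholders.
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=my_cluster_name   # placeholder
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=es.example.internal     # placeholder remote host or IP
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300                    # native transport (TCP) port, not the HTTP port
```

After changing these values, "pio status" should report whether the storage
backends can be reached.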


On Thu, Mar 2, 2017 at 1:26 PM, Miller, Clifford <
clifford.miller@phoenix-opsgroup.com> wrote:

> I'm able to run pio train, although pio train -- --master
> spark://your_master_url did not work.  I'm using Spark on YARN, so I was
> able to get pio train -- --master yarn://URL to work after I copied the
> Elasticsearch configuration from my CDH cluster.
>
> I'm still struggling with integrating this with AWS Elasticsearch.  Does
> anyone have an example of how this should be configured?
>
> FYI, the EC2 instance that I'm running PredictionIO on can access it from
> the command line: "curl -X GET <AWS Elasticsearch endpoint URL>".
>
>
> On Wed, Mar 1, 2017 at 11:44 AM, Donald Szeto <donald@apache.org> wrote:
>
>> Hi Clifford,
>>
>> To use a remote Spark cluster, use passthrough command line arguments on
>> the CLI, e.g.
>>
>> pio train -- --master spark://your_master_url
>>
>> Anything after a lone -- will be passed to spark-submit verbatim. For
>> more information try "pio help".
>>
>> To use a remote Elasticsearch cluster, please refer to examples in
>> "conf/pio-env.sh" where you could find a variable to set the remote host
>> name or IP of your ES cluster.
>>
>> Regards,
>> Donald
>>
>> On Tue, Feb 28, 2017 at 12:57 PM Miller, Clifford <
>> clifford.miller@phoenix-opsgroup.com> wrote:
>>
>>> I currently have a Cloudera cluster (Hadoop, Spark, HBase...) set up on
>>> AWS.  I have PredictionIO installed on a different EC2 instance.  I've been
>>> able to successfully configure it to use HDFS for model storage and to
>>> store events in HBase on the cluster.  Spark and Elasticsearch are
>>> installed locally on the PredictionIO EC2 instance.  I have the following
>>> questions:
>>>
>>> How can I configure PredictionIO to utilize the Spark on the Cloudera
>>> cluster?
>>> How can I configure PredictionIO to utilize a remote Elasticsearch
>>> domain?  I'd like to use the AWS Elasticsearch service if possible.
>>>
>>> Thanks
>>>
>>>
>>> --
>>> Clifford Miller
>>> Mobile | 321.431.9089
>>>
>>
>
>
> --
> Clifford Miller
> Mobile | 321.431.9089
>



-- 
Clifford Miller
Mobile | 321.431.9089
