predictionio-user mailing list archives

From Mattz <>
Subject Re: Batch recommender
Date Mon, 31 Jul 2017 04:12:55 GMT
Pat - thanks for the detailed write-up.

On Fri, Jul 28, 2017 at 11:00 PM, Pat Ferrel <> wrote:

> As it happens, I just looked into the concurrency issue: how many
> connections can be made to the Prediction Server? The answer is that Spray
> HTTP, the earlier version of what is now merged into akka-http, uses
> something called TCP back-pressure to optimize the number of connections
> allowed and the number of simultaneous responses that can be worked on in
> one connection. This accounts for how many cores, and therefore threads,
> are available on the machine. The upshot is that Spray self-optimizes for
> the maximum the machine can handle.
> This means that connection capacity can be raised by adding cores, but if
> you are not at 100% CPU usage then some other part of the system is
> probably the bottleneck, and batch queries will not help. Only if you see
> 100% CPU usage on the Prediction Server will increasing connections or
> batching queries help.
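To make the client side of this concrete, here is a minimal Python sketch of a bulk-query client that caps its own concurrency so it cooperates with the server's back-pressure rather than flooding it. The endpoint URL and the query body shape are assumptions for illustration, not details taken from this thread — adjust them to your deployment:

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Hypothetical endpoint: adjust host/port to your Prediction Server.
QUERIES_URL = "http://localhost:8000/queries.json"

def build_query(user_id, num=10):
    """Body for a UR-style recommendation query (shape is an assumption)."""
    return {"user": user_id, "num": num}

def send_query(user_id, num=10, url=QUERIES_URL):
    """POST one query and return the parsed JSON response."""
    body = json.dumps(build_query(user_id, num)).encode("utf-8")
    req = urllib.request.Request(url, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def query_all(user_ids, max_workers=8):
    """Keep at most max_workers requests in flight at once, so the client
    stays within what the server's back-pressure can absorb."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send_query, user_ids))
```

Raising `max_workers` past the point where the Prediction Server hits 100% CPU will not buy additional throughput, per the discussion above.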
> Remember that for evaluation you are putting the worst-case load on the
> Prediction Server, worse than any real-world scenario is likely to hit. And
> it is almost certain that you will be able to overload the system, since
> sending a query is much, much easier than processing it. So processing the
> query is the most likely cause of speed limits.
> Therefore, scale the system to the max time you can wait for all
> responses. Start with the Prediction Server, either add cores or put a load
> balancer in so you can have any number of PS machines. Scale HBase and
> Elasticsearch in the normal ways using their clustering methods as you see
> them start to show max CPU usage. Nothing is stored during a query but disk
> may be read so also watch I/O for bottlenecks. This assumes you have a
> clustered deployment and can measure each system's load independently. If
> you are all on one machine, good luck because there are many
> over-constrained situations where one service grabs resources another needs
> causing artificial bottlenecks.
> Assuming a clustered environment, the system is indefinitely scalable
> because no state is stored in PIO, only in scalable services like HBase and
> Elasticsearch. There are 2 internal queries for every query you make, one
> to HBase, and one to ES. Both have been use in massive deployments and so
> can handle any load you can define.
> So by scaling the PS (vertically or with load balancers) and Hbase and ES
> (through vertical or cluster expansion) you should be able to handle as
> many queries per second as you need.
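One way to act on this advice is to ramp client concurrency and record queries-per-second at each level: the point where throughput stops improving is where adding connections stops helping and scaling the services should begin. A rough sketch of the bookkeeping, using made-up numbers rather than measurements from this thread:

```python
def saturation_point(throughput_by_concurrency, tol=0.05):
    """Given {concurrency: queries_per_second} measurements, return the
    lowest concurrency level beyond which throughput improves by less than
    `tol` (as a fraction). Past this point the server, not the client,
    is the limiting factor."""
    levels = sorted(throughput_by_concurrency)
    for prev, cur in zip(levels, levels[1:]):
        gain = throughput_by_concurrency[cur] - throughput_by_concurrency[prev]
        if gain < tol * throughput_by_concurrency[prev]:
            return prev
    return levels[-1]

# Illustrative numbers: throughput roughly doubles until the CPU saturates.
measured = {1: 55.0, 2: 104.0, 4: 190.0, 8: 196.0, 16: 197.0}
print(saturation_point(measured))  # 4 on these numbers
```

If the knee appears while CPU is well under 100%, look for the bottleneck elsewhere (HBase, Elasticsearch, disk I/O), per the advice above.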
> On Jul 27, 2017, at 9:39 PM, Mattz <> wrote:
> Thanks Mars. Looks like pio eval may not work for my needs (according
> to Pat) since I am using the UR template.
> And, querying the REST API in bulk may be limited by the concurrency
> that the API can handle. I tested with a reasonably sized machine and this
> number was not high enough.
> On Thu, Jul 27, 2017 at 11:08 PM, Mars Hall <> wrote:
>> Hi Mattz,
>> Yes, that batch evaluator using `pio eval` is currently the only
>> documented way to run batch predictions.
>> It's also possible to create a custom script that calls the Queries
>> HTTP/REST API to collect predictions in bulk.
>> My team has had this need recur, so I implemented a `pio batchpredict`
>> command for PredictionIO, but it has not yet been merged and released. See the
>> pull request:
>> *Mars
>> ( <> .. <> )
>> > On Jul 27, 2017, at 05:25, Mattz <> wrote:
>> >
>> > Hello,
>> >
>> > I am using the "Universal Recommender" template. Is the below guide
>> > current if I want to create bulk recommendations in batch?
>> >
>> >
>> > recommendation/batch-evaluator/
>> >
>> > Thanks!
