predictionio-user mailing list archives

From Digambar Bhat
Subject Setup PredictionIO for large events
Date Tue, 30 Aug 2016 07:21:06 GMT

I have been using PredictionIO for the past year, and it has been working fine for me.

Earlier, importing and training worked flawlessly. But now that the number of
events has grown, training is very slow: it takes almost 9-10 hours.

Currently, there are about 15 million events and about 10 million items.

The architecture is as follows:
Spark and Elasticsearch run on two machines. Hadoop and HBase run on two
other, separate machines.

Each machine has the following configuration:
160 GB RAM, 40 CPUs, 10 cores per socket, CPU clock 3000 MHz

So please let me know what the right configuration is for this many events.
Also, let me know what I should consider, as my events are going to grow
to a billion. Will it work for such a large data set?

Thanks in advance.

