predictionio-user mailing list archives

From Pat Ferrel <...@occamsmachete.com>
Subject Re: Setup PredictionIO for large events
Date Tue, 06 Sep 2016 18:34:29 GMT
If the question is about training time, post it to the UR support forum here: https://groups.google.com/forum/#!forum/actionml-user

The best way to answer this, as I said below, is to capture the Spark GUI timeline output.
It will show, down to the task, how long things are taking. Several things can be
bottlenecks: reading from HBase, writing to ES, or the math itself. The timeline will
have all of these broken out.

Spark does not persist the event logs after the job is done, so you will need to tell it
to persist them in the job params in order to examine them after completion.

There are many ways to set log persistence, so you can google that. To use the PIO CLI, set
spark.eventLog.enabled to true:

    pio train -- --conf spark.eventLog.enabled=true

Notice that -- is a separator, so put any pio options before it on the command line. Everything
after it is passed raw to SparkSubmit, so see its documentation for further options.
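
As a minimal sketch of a fuller invocation, assuming the event logs are written to a local
directory: /tmp/spark-events is used here only as an illustration (it is Spark's default
for spark.eventLog.dir), and the directory must exist before the job starts.

    # Assumed log location; adjust to a path that suits your setup.
    mkdir -p /tmp/spark-events
    pio train -- --conf spark.eventLog.enabled=true \
                 --conf spark.eventLog.dir=file:///tmp/spark-events

The persisted logs can then be browsed with the Spark history server, which reads from
spark.history.fs.logDirectory (set it to the same path if you change the directory above)
and serves its GUI on port 18080 by default:

    $SPARK_HOME/sbin/start-history-server.sh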

You will find the timeline by clicking the finished job on the front page of the GUI, then expanding
the “timeline” link at the top of the job page.


On Sep 5, 2016, at 11:08 PM, Digambar Bhat <digambarbhat14@gmail.com> wrote:

Thanks Tom for reply.

I checked the number of cores. There are two CPUs with 10 cores each. Also, virtualization is
enabled, so we get 40 CPUs in total. The number of regions for the app table is 2.

So may I know how to increase the number of regions for the app table?


On 06-Sep-2016 10:40 am, "Tom Chan" <yukhei.chan@gmail.com> wrote:
One quick thing to check is the number of regions in the HBase table for your app. If it's
less than the number of cores you have, then you won't be utilizing all of your computing
power. Hope this helps.

Tom


On Sep 5, 2016 9:05 PM, "Digambar Bhat" <digambarbhat14@gmail.com> wrote:
Update please..


On 30-Aug-2016 8:06 pm, "Digambar Bhat" <digambarbhat14@gmail.com> wrote:
I am using Universal Recommender.


On 30-Aug-2016 8:05 pm, "Pat Ferrel" <pat@occamsmachete.com> wrote:
Training time is also template dependent. What template are you using?

On Aug 30, 2016, at 12:21 AM, Digambar Bhat <digambarbhat14@gmail.com> wrote:

Hello,

I have been using PredictionIO for the last year. It's working fine for me.

Earlier, importing and training worked flawlessly. But now training is very slow as the number
of events has increased. Training takes almost 9-10 hours.

Currently, events are about 15 million and items are about 10 million.

The architecture is as follows:
Spark and Elasticsearch are on two machines. Hadoop and HBase are on another two separate machines.

Each machine has the following configuration:
160 GB RAM, 40 CPUs, 10 cores per socket, CPU clock 3000 MHz

So please let me know what the right configuration is for such a large number of events. Also
let me know what I should consider, as my events are going to increase to a billion. Will it
work for such a large data set?

Thanks in advance.

Thanks,
Digambar



