Subject: Re: Disable hbase user history queries
From: Martin Fernandez <martingfernandez@gmail.com>
Date: Thu, 1 Jun 2017 15:19:35 -0300
To: user@predictionio.incubator.apache.org
Cc: actionml-user
Thanks Pat for your reply. I am doing Video on Demand e-commerce, in which realtime queries would be very helpful, but I want to minimize the risks of HDFS synchronization latency between datacenters. Do you have experience running PredictionIO + Universal Recommender in multiple DCs that you can share? Did you face any latency issues with the HBase cluster?

Thanks in advance

On Thu, Jun 1, 2017 at 2:53 PM, Pat Ferrel <pat@occamsmachete.com> wrote:
First, I'm not sure this is a good idea. You lose the realtime nature of recommendations based on up-to-the-second recording of user behavior. You get this with live user event input even without re-calculating the model in realtime.

Second, no, you can't disable queries for user history; it is the single most important key to personalized recommendations.

I'd have to know more about your application, but the first line of cost cutting for us in custom installations (I work for ActionML, the maintainer of the UR Template) is to make the Spark cluster temporary, since it is not needed to serve queries and only needs to run during training. We start it up, train, then shut it down.
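A rough sketch of that train-only workflow, assuming a small Python wrapper around the `pio` CLI; the cluster start/stop helpers are hypothetical placeholders for whatever provisioning mechanism you actually use:

import subprocess

def start_spark_cluster():
    # Hypothetical placeholder: provision a temporary Spark cluster
    # (cloud API, Terraform, etc.) and return its master URL.
    return "spark://spark-master:7077"

def stop_spark_cluster():
    # Hypothetical placeholder: tear the temporary cluster back down.
    pass

def train_with_temporary_cluster():
    master_url = start_spark_cluster()
    try:
        # Arguments after "--" are passed through to spark-submit,
        # so training runs on the temporary cluster.
        subprocess.run(["pio", "train", "--", "--master", master_url], check=True)
    finally:
        stop_spark_cluster()

if __name__ == "__main__":
    train_with_temporary_cluster()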

If you really want to shut the entire system down and don't want realtime user behavior, you can query for all users and put the results in your DB or an in-memory cache like a hashmap, then just serve from that DB or cache. This takes you back to the days of the old Mahout MapReduce recommenders (pre-2014), but maybe it fits your app.
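A minimal sketch of that batch approach, assuming a deployed engine listening on the default PredictionIO query endpoint (http://localhost:8000/queries.json); the user list and the in-memory dict are stand-ins for your own user store and cache:

import requests

ENGINE_URL = "http://localhost:8000/queries.json"  # assumed host/port of the deployed engine

def precompute_recommendations(user_ids, num=10):
    # Query the deployed engine once per user and keep the results
    # in an in-memory hashmap (swap for your DB or Redis in production).
    cache = {}
    for user_id in user_ids:
        resp = requests.post(ENGINE_URL, json={"user": user_id, "num": num})
        resp.raise_for_status()
        cache[user_id] = resp.json().get("itemScores", [])
    return cache

# Usage: run this as a batch job after training, then serve requests
# from the cache instead of hitting the HBase-backed live system.
recs = precompute_recommendations(["u1", "u2", "u3"])
print(recs.get("u1", []))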

If you are doing e-commerce, think about a user's shopping behavior. They shop, browse, then buy. Once they buy, that old shopping behavior is no longer indicative of realtime intent. If you miss using that behavior, you may miss the shopping session altogether. But again, your needs may vary.


On Jun 1, 2017, at 6:19 AM, Martin Fernandez <martingfernandez@gmail.com> wrote:

Hello guys,

we are trying to deploy Universal Recommender + PredictionIO in our infrastructure, but we don't want to distribute HBase across datacenters because of the latency. So the idea is to build and train the engine offline and then copy the model and Elasticsearch data to PIO replicas. I noticed that when I deploy the engine, it always tries to connect to the HBase server, since it is used to query user history. Is there any way to disable those user history queries and avoid the connection to HBase?

Thanks

Martin




--
Saludos / Best Regards,

Martin Gustavo Fernandez
Mobile: +5491132837292
