mahout-user mailing list archives

From hlqv <hlqvu...@gmail.com>
Subject Re: spark-itemsimilarity out of memory problem
Date Tue, 23 Dec 2014 17:18:55 GMT
Thank you for your explanation.

There is one situation I'm still not clear about. Suppose I have this
item-similarity result:

iphone    nexus:1 ipad:10
surface   nexus:10 ipad:1 galaxy:1

Now omit the LLR weights.
If user A has the purchase history 'nexus', which one should the
recommendation engine prefer: 'iphone' or 'surface'?
If user B has the purchase history 'ipad', 'galaxy', then I think the
recommendation engine should recommend 'iphone' instead of 'surface' (if
TF-IDF weights are applied, the recommendation engine will return 'surface').

I really don't know whether my understanding here is mistaken.
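
To make the two cases concrete, here is a toy sketch in Python. It is not Mahout's or any search engine's actual scoring. It assumes the numbers after the colons are similarity strengths, and it stands in for TF-IDF with a simplified IDF of 1 + ln(N/df) and no length normalization. The function names are hypothetical.

```python
import math

# Toy indicator data from the message: item -> {indicator item: strength}.
# Treating the numbers as strengths is an assumption for illustration.
indicators = {
    "iphone": {"nexus": 1, "ipad": 10},
    "surface": {"nexus": 10, "ipad": 1, "galaxy": 1},
}

def strength_scores(history):
    """Sum the stored strengths of the indicators found in the history."""
    return {item: sum(w for ind, w in inds.items() if ind in history)
            for item, inds in indicators.items()}

def idf_scores(history):
    """Ignore strengths; weight each matching indicator by a simple IDF."""
    n_docs = len(indicators)
    df = {}  # number of items whose indicator list contains each indicator
    for inds in indicators.values():
        for ind in inds:
            df[ind] = df.get(ind, 0) + 1
    # Simplified IDF (an assumption, not a search engine's exact formula).
    idf = {ind: 1.0 + math.log(n_docs / count) for ind, count in df.items()}
    return {item: sum(idf[ind] for ind in inds if ind in history)
            for item, inds in indicators.items()}

def best(scores):
    """Return the items sharing the top score, sorted by name."""
    top = max(scores.values())
    return sorted(item for item, s in scores.items() if math.isclose(s, top))

for user, history in [("A", {"nexus"}), ("B", {"ipad", "galaxy"})]:
    print(user, "strengths:", best(strength_scores(history)),
          "idf only:", best(idf_scores(history)))
```

Under these assumptions, user A gets 'surface' when strengths are kept and a tie when they are omitted. User B gets 'iphone' when strengths are kept and 'surface' with the IDF-only score, because 'galaxy' is the rarer indicator and 'surface' matches two terms.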

On 23 December 2014 at 23:14, Pat Ferrel <pat@occamsmachete.com> wrote:

> Why do you say it will lead to less accuracy?
>
> The weights are LLR weights and they are used to filter and downsample the
> indicator matrix. Once the downsampling is done they are not needed. When
> you index the indicators in a search engine they will get TF-IDF weights
> and this is a good effect. It will downweight very popular items which hold
> little value as an indicator of user’s taste.
>
> On Dec 23, 2014, at 1:17 AM, hlqv <hlqvuong@gmail.com> wrote:
>
> Hi Pat Ferrel,
> Using the --omitStrength option outputs indexable data, but this leads to
> less accuracy when querying because the similarity values between items
> are omitted. Is there a way to keep these values in order to improve
> accuracy in a search engine?
>
> On 23 December 2014 at 02:17, Pat Ferrel <pat@occamsmachete.com> wrote:
>
> > Also Ted has an ebook you can download:
> > mapr.com/practical-machine-learning
> >
> > On Dec 22, 2014, at 10:52 AM, Pat Ferrel <pat@occamsmachete.com> wrote:
> >
> > Hi Hani,
> >
> > I recently read about Souq.com. A very promising project.
> >
> > If you are looking at spark-itemsimilarity for ecommerce-type
> > recommendations, you may be interested in some slide decks and blog posts
> > I’ve done on the subject.
> > Check out:
> >
> > http://occamsmachete.com/ml/2014/10/07/creating-a-unified-recommender-with-mahout-and-a-search-engine/
> > http://occamsmachete.com/ml/2014/08/11/mahout-on-spark-whats-new-in-recommenders/
> > http://occamsmachete.com/ml/2014/09/09/mahout-on-spark-whats-new-in-recommenders-part-2/
> >
> > Also I put up a demo site that uses some of these techniques:
> > https://guide.finderbots.com
> >
> > Good luck,
> > Pat
> >
> > On Dec 21, 2014, at 11:44 PM, AlShater, Hani <halshater@souq.com> wrote:
> >
> > Hi All,
> >
> > I am trying to use spark-itemsimilarity on a dataset of 160M user
> > interactions. The job launches and runs successfully on a small
> > dataset of 1M actions. However, on the larger dataset, some Spark
> > stages keep failing with out-of-memory exceptions.
> >
> > I tried changing spark.storage.memoryFraction from the Spark default
> > configuration, but I hit the same issue again. How should I configure
> > Spark when using spark-itemsimilarity, or how can I overcome this
> > out-of-memory issue?
> >
> > Can you please advise?
> >
> > Thanks,
> > Hani.
> >
> >
> > Hani Al-Shater | Data Science Manager - Souq.com <http://souq.com/>
> > Mob: +962 790471101 | Phone: +962 65821236 | Skype:
> > hani.alshater@outlook.com | halshater@souq.com <lghafri@souq.com> |
> > www.souq.com
> > Nouh Al Romi Street, Building number 8, Amman, Jordan
