mahout-user mailing list archives

From Ted Dunning
Subject Re: MongoDBDataModel in memory ?
Date Sun, 18 Mar 2012 20:54:52 GMT

What is the humongous amount of data in Mongo?  Is it really item->item
links?  Or is it session information?

With a recommender, it is unusual to have more than a few hundred links to
other items for any given item.  This means that even for 10 million items,
you only have about a billion links in total and that can usually fit in
memory on a single machine pretty easily.  Recommenders with 10 million
items are pretty rare and can often be factored down by some content

So, are you sure your data is too large for memory?
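
The back-of-the-envelope arithmetic above can be sketched as a small Java
program. This is only a rough estimate under stated assumptions: the figure of
100 links per item and the 12 bytes per link (an 8-byte long item ID plus a
4-byte float weight) are illustrative guesses, not numbers from Mahout, and the
class and method names are hypothetical. JVM object and hash table overhead is
ignored, so real usage would be higher.

```java
// Hypothetical helper for a rough memory estimate; not part of Mahout.
final class LinkMemoryEstimate {

    // Total item->item links if every item keeps linksPerItem neighbours.
    static long totalLinks(long items, long linksPerItem) {
        return items * linksPerItem;
    }

    // Raw payload size in bytes, ignoring JVM object and map overhead.
    static long rawBytes(long links, int bytesPerLink) {
        return links * bytesPerLink;
    }

    public static void main(String[] args) {
        long items = 10_000_000L;   // the large catalogue size discussed above
        long linksPerItem = 100L;   // assumption: low end of "a few hundred"
        int bytesPerLink = 12;      // assumption: long ID (8) + float weight (4)

        long links = totalLinks(items, linksPerItem);
        long bytes = rawBytes(links, bytesPerLink);

        System.out.println("links     = " + links);
        System.out.println("raw bytes = " + bytes);
    }
}
```

The total scales linearly, so 300 links per item would triple both numbers.
Plugging in your own item count and average links per item should show quickly
whether the data is really too large for one machine.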

On Sun, Mar 18, 2012 at 1:12 PM, Mridul Kapoor wrote:

> Hi,
> I want to build an item-based recommender using Mahout. I have a
> humongous amount of data in a MongoDB collection, but I am not sure that
> the MongoDBDataModel provided with Mahout will be able to handle my case. I
> see that in the buildModel() function, it creates a
> > FastByIDMap<Collection<Preference>> userIDPrefMap = new
> > FastByIDMap<Collection<Preference>>();
> >
> [line 556]
> Does the subsequent code create an in-memory model of the data from the
> MongoDB collection (which I think it does)? If yes, is there any
> immediate alternative to that?
> Thanks
> Mridul
