mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: DataModel
Date Wed, 06 Oct 2010 16:09:01 GMT
In general, if you want real-time recommendations, you want the data in
memory. Otherwise it's too slow. The JDBC-backed model works for, roughly,
small problems up to a couple million ratings. Beyond that, stick it in
memory. (And past about 100M ratings, you need to consider distributing the
computation.)
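
As a rough illustration of the rule of thumb above, here is a minimal plain-Java sketch. It does not use any Mahout API. The class name, the enum, and the exact cutoffs are assumptions made for the example: "a couple million" is taken as 2,000,000 and "about 100M" as 100,000,000, and both are approximate guidance rather than hard limits.

```java
// Hypothetical helper, not part of Mahout: encodes the sizing rule of thumb.
class DataModelRuleOfThumb {
    enum Backing { JDBC_FEASIBLE, IN_MEMORY, DISTRIBUTED }

    // Assumed cutoffs; the message only gives rough orders of magnitude.
    static final long JDBC_CEILING = 2_000_000L;             // "a couple million ratings"
    static final long SINGLE_MACHINE_CEILING = 100_000_000L; // "about 100M ratings"

    static Backing suggest(long numRatings) {
        if (numRatings > SINGLE_MACHINE_CEILING) {
            return Backing.DISTRIBUTED;   // consider distributing the computation
        }
        if (numRatings > JDBC_CEILING) {
            return Backing.IN_MEMORY;     // too large for a JDBC-backed model to stay fast
        }
        return Backing.JDBC_FEASIBLE;     // JDBC can work; in-memory is still faster
    }

    public static void main(String[] args) {
        for (long n : new long[] {500_000L, 20_000_000L, 500_000_000L}) {
            System.out.println(n + " ratings -> " + suggest(n));
        }
    }
}
```

Even in the JDBC_FEASIBLE range, keeping the data in memory remains the faster option for real-time recommendations.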

FileDataModel is also in-memory, since it uses GenericDataModel internally. It
is built for fine-grained updates through its "delta files" support.
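
To make the "delta files" idea concrete, here is a small plain-Java sketch of the general technique: keep a snapshot in memory and apply a small batch of changes to it instead of reloading everything. This is not Mahout's implementation. The class and method names are hypothetical, and the line format (userID,itemID,value, with an empty value meaning "remove this preference") is an assumption for the example; check the FileDataModel Javadoc for the actual conventions.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of applying a delta to an in-memory snapshot.
class DeltaSketch {
    // prefs maps userID -> (itemID -> rating).
    static void applyDelta(Map<Long, Map<Long, Float>> prefs, List<String> deltaLines) {
        for (String line : deltaLines) {
            String[] parts = line.split(",", -1);  // -1 keeps a trailing empty field
            long user = Long.parseLong(parts[0].trim());
            long item = Long.parseLong(parts[1].trim());
            String value = parts.length > 2 ? parts[2].trim() : "";
            if (value.isEmpty()) {
                // Assumed convention: an empty value removes the preference.
                Map<Long, Float> items = prefs.get(user);
                if (items != null) {
                    items.remove(item);
                    if (items.isEmpty()) {
                        prefs.remove(user);
                    }
                }
            } else {
                // Otherwise add the preference or overwrite the existing one.
                prefs.computeIfAbsent(user, u -> new HashMap<>())
                     .put(item, Float.parseFloat(value));
            }
        }
    }
}
```

With daily updates, only the day's changes would need to be parsed and applied, while the bulk of the data stays loaded.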



On Wed, Oct 6, 2010 at 3:57 PM, James James <recommendersystem@yahoo.com> wrote:

> Hi,
>
> I was evaluating which DataModel should be used when we are dealing with a
> large amount of user data with new data coming in on a regular basis (for
> example on a daily basis). The GenericModel is immutable, which requires the
> user data to be reloaded when new data comes in. I have not tried
> JDBCDataModel yet. Based on the posts here, it seems to me the reloading is
> not needed for JDBCDataModel since it is always kept up-to-date.
>
>
> Do you think that JDBCDataModel is more efficient for my case? Are there any
> implementations of DataModel using HBase?
>
> Thanks,
>
> James
