In general, if you want real-time recommendations, you want the data in
memory; otherwise it's too slow. The JDBC-backed model works, roughly, for
small problems of up to a couple million ratings. Beyond that, keep the data
in memory. (And past about 100M ratings, you need to consider distributing
the computation.)
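
To make those rules of thumb concrete, here is a rough sketch in plain Java. It is not part of Mahout: the class, enum and method names are made up for illustration, and the 2M cutoff is just my reading of "a couple million". Treat the thresholds as ballpark figures, not hard limits.

```java
// Illustrative only: encodes the rough sizing guidance above.
// Names and exact cutoffs are assumptions, not anything from Mahout.
final class DataModelSizing {

    enum Strategy { JDBC_BACKED_OK, IN_MEMORY, DISTRIBUTED }

    // "a couple million" ratings (assumed cutoff)
    static final long JDBC_LIMIT = 2_000_000L;
    // "about 100M" ratings
    static final long IN_MEMORY_LIMIT = 100_000_000L;

    // Suggests a starting point based purely on the number of ratings.
    static Strategy suggest(long ratingCount) {
        if (ratingCount < 0) {
            throw new IllegalArgumentException("ratingCount must be >= 0");
        }
        if (ratingCount <= JDBC_LIMIT) {
            return Strategy.JDBC_BACKED_OK; // small problem: JDBC-backed can work
        }
        if (ratingCount <= IN_MEMORY_LIMIT) {
            return Strategy.IN_MEMORY;      // load the data into memory
        }
        return Strategy.DISTRIBUTED;        // consider distributing the computation
    }

    public static void main(String[] args) {
        long[] samples = {500_000L, 20_000_000L, 250_000_000L};
        for (long n : samples) {
            System.out.println(n + " ratings -> " + suggest(n));
        }
    }
}
```

Even in the first bucket, an in-memory model is still the better choice if you need real-time responses.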
FileDataModel is also in-memory since it uses GenericDataModel inside. It's
built for fine-grained updates with its "delta files" support.
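
On the delta files: as far as I recall from the FileDataModel Javadoc, they sit in the same directory as the main data file and share its name up to the first period, and they are picked up when the model is refreshed. The sketch below only mimics that naming rule in plain Java so you can see which file names would qualify. It is not Mahout's code, the class and method names are made up, and you should check the Javadoc for your version before relying on it.

```java
// Standalone sketch of the delta-file naming rule as I understand it.
// Not Mahout code; names here are hypothetical.
final class DeltaFileNames {

    // Part of the file name before the first period, or the whole name
    // if there is no period.
    static String prefixOf(String dataFileName) {
        int period = dataFileName.indexOf('.');
        return period < 0 ? dataFileName : dataFileName.substring(0, period);
    }

    // True if candidateName shares the main file's prefix and is not the
    // main file itself. Note this is a plain prefix match, so
    // "ratings_old.csv" would also match "ratings.csv".
    static boolean looksLikeDeltaFile(String dataFileName, String candidateName) {
        return candidateName.startsWith(prefixOf(dataFileName))
            && !candidateName.equals(dataFileName);
    }

    public static void main(String[] args) {
        String mainFile = "ratings.csv";
        String[] names = {"ratings.csv", "ratings.1.csv", "ratings.2010-10-06.csv", "users.csv"};
        for (String name : names) {
            System.out.println(name + " -> " + looksLikeDeltaFile(mainFile, name));
        }
    }
}
```

So for a daily feed, one option would be to write each day's new ratings to a file such as ratings.2010-10-06.csv next to ratings.csv and then refresh the model, rather than rewriting the whole main file.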
On Wed, Oct 6, 2010 at 3:57 PM, James James <recommendersystem@yahoo.com> wrote:
> Hi,
>
> I was evaluating which DataModel should be used when we are dealing with a
> large amount of user data with new data coming in on a regular basis (for
> example on a daily basis). The GenericModel is immutable, which requires the
> user data to be reloaded when new data comes in. I have not tried
> JDBCDataModel yet. Based on the posts here, it seems to me the reloading is
> not needed for JDBCDataModel since it is always kept up-to-date.
>
>
> Do you think that JDBCDataModel is more efficient for my case? Are there any
> implementations of DataModel using HBase?
>
> Thanks,
>
> James