mahout-user mailing list archives

From Andre Panisson <panis...@di.unito.it>
Subject Re: Considering removing User/Item abstractions
Date Wed, 22 Apr 2009 12:05:32 GMT
I think removing the User and Item abstractions would be a good idea.
The User interface is a bit more complex because of its getPreferences
methods, but I think they can easily be moved to the DataModel. There
will be some impact on code that has already been written, but I think
the benefits are worth it.
I don't know whether removing the Preference abstraction will improve
performance. The getPreferences methods are very useful for iterating
over the preferences of users and items, and I think it saves a lot of
lookups if the user/item association is present in a single object.
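
To make the point about lookups concrete, here is a minimal sketch of what an ID-based model could look like if it kept a small preference object. Every name in it (IdBasedModelSketch, Pref, setPreference, getPreferencesFromUser, getPreferencesForItem) is hypothetical and chosen only for illustration; it is not the existing Taste API. It assumes user and item IDs are plain long values and preference values are floats.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical ID-based model: no User or Item objects, only IDs and values. */
final class IdBasedModelSketch {

    /** Hypothetical preference holding the user/item association in one object. */
    static final class Pref {
        final long userID;
        final long itemID;
        final float value;

        Pref(long userID, long itemID, float value) {
            this.userID = userID;
            this.itemID = itemID;
            this.value = value;
        }
    }

    // The same Pref instance is indexed by user and by item, so iterating
    // either way needs no extra lookup to recover the other ID.
    private final Map<Long, List<Pref>> byUser = new HashMap<>();
    private final Map<Long, List<Pref>> byItem = new HashMap<>();

    /** Adds a preference. Sketch only: an existing preference is not replaced. */
    void setPreference(long userID, long itemID, float value) {
        Pref p = new Pref(userID, itemID, value);
        byUser.computeIfAbsent(userID, k -> new ArrayList<>()).add(p);
        byItem.computeIfAbsent(itemID, k -> new ArrayList<>()).add(p);
    }

    /** Returns the preferences of one user, or an empty list if there are none. */
    List<Pref> getPreferencesFromUser(long userID) {
        List<Pref> prefs = byUser.get(userID);
        return prefs == null ? Collections.<Pref>emptyList() : prefs;
    }

    /** Returns the preferences for one item, or an empty list if there are none. */
    List<Pref> getPreferencesForItem(long itemID) {
        List<Pref> prefs = byItem.get(itemID);
        return prefs == null ? Collections.<Pref>emptyList() : prefs;
    }

    public static void main(String[] args) {
        IdBasedModelSketch model = new IdBasedModelSketch();
        model.setPreference(1L, 10L, 4.0f);
        model.setPreference(1L, 11L, 2.5f);
        model.setPreference(2L, 10L, 5.0f);

        // The caller starts from a bare user ID and still gets item ID and value together.
        for (Pref p : model.getPreferencesFromUser(1L)) {
            System.out.println("user " + p.userID + " -> item " + p.itemID + " = " + p.value);
        }
    }
}
```

In this sketch callers pass only IDs, yet each Pref still carries both IDs and the value, which is the association I would like to keep.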

André

Quoting Sean Owen <srowen@gmail.com>:

> I am considering a somewhat large change to org.apache.mahout.cf.taste code
> and would like to solicit feedback from users.
>
> The change would be to remove the User, Item and Preference
> interfaces/abstractions from the code. Everything would proceed in terms of
> user and item IDs, and preference values instead.
>
> The reasons for these interfaces originally were, well, it seemed nice. It
> also provided a way for implementors to substitute domain-specific
> implementations with additional information.
>
> But there are problems too.
>
> - Do methods take a User, or user ID? The code is not consistent in this
> regard. If User, the caller is forced to look up a User if it only has an
> ID. (Conversely, if the caller already has a User, and the method needs a
> User, then passing an ID only forces a redundant lookup. I think this is
> rarer.)
>
> - Factory method problem. There are many points in the code where it should
> call to factory methods to generate a User/Item/Preference object since the
> domain may use specialized implementations instead of GenericUser, etc. At
> the moment some methods just assume GenericUser, etc. Fixing this would be a
> bit hard but would more importantly impact performance I think.
>
> - Object overhead. Holding these extra objects has a cost in memory and
> performance.
>
> The code already really assumes there is nothing but user and item IDs and
> a pref value. So why not make the core reflect this and gain some simplicity
> and speed?
>
> I think that domains that need to inject extra information can still do this
> fine without needing custom User, Item implementations.
>
> It is just a thought now. Anybody have more?
>
> Sean
>



