mahout-user mailing list archives

From Mirko Gontek <mirko.gon...@uni-koeln.de>
Subject Re: FileDataModel throws UnsupportedOperationException
Date Wed, 22 Apr 2009 14:29:22 GMT
Hi Sean,
when you say that FileDataModel was originally intended to be
read-only, I get the impression that I am on the wrong track. Maybe you
could comment on my thoughts; that would be a great help...

I would like to implement a GenericItemBasedRecommender. My test data
is a DB with 300,000 preferences (130,000 items, 12,000 users).

1) I implement a DataModel that initially loads all data from the DB  
into memory and works with the data in memory from that point on. My  
DataModel implementation only accesses (read/write) the DB on refresh().

2) For the recommender to be fast, I need pre-computed  
ItemItemSimilarities. Thus, I implement ItemSimilarity. My  
implementation keeps all ItemItemSimilarities in memory, until  
refresh(). Like above, my ItemSimilarity implementation only accesses  
(read/write) the DB on refresh().

3) Since I don't have a good method to calculate item similarities  
yet, I want to use the following to generate itemSimilarities once:
ItemSimilarity itemSimilarity = new GenericItemSimilarity(
    new PearsonCorrelationSimilarity(dataModel), dataModel, maxToKeep);
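
To make the pattern in 1) and 2) concrete, here is a minimal sketch of "keep everything in memory, touch the backing store only on refresh()". It does not use the Taste interfaces: the class name InMemoryPreferenceStore, the loader and writer callbacks (stand-ins for the DB read and write), and the nested-map layout are all assumptions made for illustration. It is also not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical illustration only; this is not a Taste/Mahout class.
final class InMemoryPreferenceStore {
    // Stand-in for "SELECT all preferences": user -> (item -> value).
    private final Supplier<Map<String, Map<String, Double>>> loader;
    // Stand-in for "write the in-memory state back to the DB".
    private final Consumer<Map<String, Map<String, Double>>> writer;
    private Map<String, Map<String, Double>> prefsByUser = new HashMap<>();

    InMemoryPreferenceStore(Supplier<Map<String, Map<String, Double>>> loader,
                            Consumer<Map<String, Map<String, Double>>> writer) {
        this.loader = loader;
        this.writer = writer;
        load(); // initial load; no write-back yet
    }

    // The only method that touches the backing store after construction:
    // write pending in-memory changes, then reload.
    void refresh() {
        writer.accept(prefsByUser);
        load();
    }

    // Deep-copies the loaded data so later in-memory edits stay local.
    private void load() {
        Map<String, Map<String, Double>> copy = new HashMap<>();
        for (Map.Entry<String, Map<String, Double>> e : loader.get().entrySet()) {
            copy.put(e.getKey(), new HashMap<>(e.getValue()));
        }
        prefsByUser = copy;
    }

    // Returns null if the user has no preference for the item.
    Double getPreference(String user, String item) {
        Map<String, Double> items = prefsByUser.get(user);
        return items == null ? null : items.get(item);
    }

    // Changes memory only; the backing store sees it on the next refresh().
    void setPreference(String user, String item, double value) {
        prefsByUser.computeIfAbsent(user, u -> new HashMap<>()).put(item, value);
    }

    int getNumUsers() {
        return prefsByUser.size();
    }
}
```

With this shape, reads and writes between two refresh() calls never hit the DB, at the cost of holding the full preference set in memory.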

My question is: is it good practice to keep all data in memory until
refresh()? I mean, memory is of course limited, so memory-based
DataModel (or ItemSimilarity) implementations are limited, right? (For
this reason I looked at FileDataModel.)

Best, Mirko


On 22.04.2009 at 11:27, Sean Owen wrote:

> Ha it is really funny that you ask. Yes, this *was* intended behavior.
> FileDataModel and GenericDataModel were read-only, so the behavior you
> saw was intended. It is mostly for performance, and because
> FileDataModel gets updates from the file, not from the caller.
>
> But just yesterday I changed these methods to work. You can call
> setPreference() and removePreference() now. Be warned though that the
> methods are not thread-safe -- you need to synchronize. They may be
> slow. And FileDataModel does not update the underlying file with your
> change; the change only remains in memory.
>
> If you get the latest version from Subversion you will see this.
>
> Sean
>
> On Wed, Apr 22, 2009 at 9:59 AM, Mirko Gontek <mirko.gontek@uni-koeln.de> wrote:
>> Hi,
>> I am experimenting with Taste's FileDataModel to get a simple file-based
>> example running. Getter methods are working, but setPreference() and
>> removePreference() throw java.lang.UnsupportedOperationException.
>> Am I getting something wrong here? Here is my test code:
>>
>> File f = new File("input/data.csv");
>> logger.debug(f.canWrite()); // true
>> DataModel model = new FileDataModel(f);
>> logger.debug("items "+ model.getNumItems()); // 3
>> logger.debug("users "+ model.getNumUsers()); // 3
>> Object[] prefs = model.getPreferencesForItemAsArray("evolution");
>> logger.debug("prefs for evolution: "+prefs.length); // 1
>>
>> model.setPreference("tom", "physics", new Double(0.1)); // THROWS EXCEPTION
>> // model.removePreference("tom","evolution"); // THROWS EXCEPTION
>>
>> java.lang.UnsupportedOperationException
>>        at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.setPreference(FileDataModel.java:322)
>>
>> The content of input/data.csv is:
>>
>> tom,evolution,1
>> anna,human,0
>> tim,biology,1
>>
>> Thanks for your help! Mirko
>>
>>

