Yes, I am planning on implementing something along the lines of (3).
For now the data will persist in a .txt file, but I will probably move
to MySQL in the near future. Essentially, I would like to update the
in-memory DataModel on each successive HTTP request and only reload
from the file periodically, e.g. when restarting the web service.

On Mon, Apr 20, 2009 at 10:17 PM, Sean Owen wrote:
> So there are two important components here, your Recommender and your
> DataModel.
>
> DataModels should always have the most up-to-date data about your
> domain -- they don't cache or anything (well... FileDataModel reads
> into memory because it is just not efficient to seek through a file
> for data every time). So yes, you want new information to immediately
> update the DataModel if you can.
>
> Yes, calling refresh(null) will cause FileDataModel to reload all the
> data from the file. I agree it does not sound efficient to just use
> this, but let me make a couple of related points:
>
> 1) You can push updates to the file without re-pushing the whole file.
> If your main data file is /foo/data.txt.gz, you can push a file like
> /foo/data.update1.txt.gz next to it, and that data will be read after
> the main file and override what is in the main file. However, it is
> still not efficient to push a small file and reload on every update. I
> would consider this only if you are willing to batch updates and push
> them periodically instead.
>
> 2) You probably want to persist the data you are receiving, maybe in a
> database? If the data already exists in a database, you can use
> something like MySQLJDBCDataModel to read from there instead of a
> file.
>
> 3) Or, I imagine you are persisting this data somehow, maybe not in a
> database. You can always write a custom DataModel based on that, again
> rather than also updating a file. If you are considering updating the
> data structures you see in FileDataModel -- I think you are going down
> this road. I might suggest you just copy-and-paste it, toss the
> parts you don't want, and add the logic you need.
>
> 4) Right now FileDataModel.{set,remove}Preference() throw an
> exception since these are not supported -- the implementation is
> read-only. I could change this to make these methods update the
> in-memory representation -- but it would not change the underlying
> file, and any such updates would be lost on the next reload. Still, if
> it helps meet your needs I can make that change.
>
> The Recommender, on the other hand, I would not refresh on every
> request -- certainly not slope-one, as this algorithm needs a lot of
> preprocessing. Instead I would refresh it periodically -- once an
> hour, once a day -- whatever meets your performance / freshness goals.
>
> On Tue, Apr 21, 2009 at 6:03 AM, Matthew Roberson wrote:
> > BTW, I am using FileDataModel and RecommenderServlet to run a
> > slope-one Recommender as a web service.
> >
> > I wanted to update the datamodel for each HTTP request, as more user
> > preference data will be generated with each new request.
> >
> > Would this be handled by a call to the reload() function contained
> > within FileDataModel.java for each HTTP request?
> >
> > It appears that this is the function call that begins the process of
> > updating the datamodel via a call to processfile().
> >
> > Also, the reload function requires all ratings files to be uploaded
> > to update the datamodel. I assume this is done because the Map
> > containing users and their preferences is local only to the
> > processfile() method. I was planning on making this Map global so
> > that I can update the datamodel without having to upload all the
> > ratings files. Do you see any pitfalls in this plan?
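
For reference, the approach discussed in this thread (update an in-memory structure on each request, reload from the file only periodically) might look roughly like the sketch below. It is only an illustration under stated assumptions: InMemoryPreferences and its methods are hypothetical names, the class is not Mahout's FileDataModel and does not implement the DataModel interface, and the "userID,itemID,value" line format is assumed for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the in-memory state of a custom DataModel.
// Not part of Mahout; names and file format are assumptions for illustration.
class InMemoryPreferences {

    // userID -> (itemID -> preference value); replaced wholesale on reload.
    private volatile Map<Long, Map<Long, Float>> prefs = new ConcurrentHashMap<>();

    // Intended to be called per HTTP request: updates memory only, no file I/O.
    void setPreference(long userID, long itemID, float value) {
        prefs.computeIfAbsent(userID, u -> new ConcurrentHashMap<>()).put(itemID, value);
    }

    // Removes a single preference if it exists; a no-op otherwise.
    void removePreference(long userID, long itemID) {
        Map<Long, Float> items = prefs.get(userID);
        if (items != null) {
            items.remove(itemID);
        }
    }

    // Returns null when the user has no preference for the item.
    Float getPreference(long userID, long itemID) {
        Map<Long, Float> items = prefs.get(userID);
        return items == null ? null : items.get(itemID);
    }

    // Periodic full reload: parses "userID,itemID,value" lines into a fresh
    // map and swaps it in. Any in-memory updates that were not also written
    // to the backing store are discarded by this swap.
    void reload(Iterable<String> lines) {
        Map<Long, Map<Long, Float>> fresh = new ConcurrentHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = trimmed.split(",");
            long userID = Long.parseLong(parts[0].trim());
            long itemID = Long.parseLong(parts[1].trim());
            float value = Float.parseFloat(parts[2].trim());
            fresh.computeIfAbsent(userID, u -> new ConcurrentHashMap<>()).put(itemID, value);
        }
        prefs = fresh;
    }
}
```

A caller could pass the result of java.nio.file.Files.readAllLines(...) to reload(). Note that the sketch reproduces the caveat from point 4: updates made through setPreference() are lost on the next reload() unless they are also persisted somewhere the reload reads from.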