mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: FileDataModel question: loading incremental files
Date Mon, 15 Nov 2010 16:42:10 GMT
Yes, I think you could make this change -- skip an update file unless its
modified date is after the *latest* of the last-modified dates of the data
file and all the update files. Would you be interested in trying out a change
like this locally to verify it works, and then posting the patch? I think it's
just a few lines of change.
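
A minimal sketch of the check described above, not the actual FileDataModel
code. It assumes the "latest" timestamp is the one recorded at the previous
load, and it uses a hypothetical FileStamp class (a name plus a last-modified
time in milliseconds) in place of real files so the logic can be run on its
own. The class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

public class UpdateFileFilter {

    /** Hypothetical stand-in for a file on disk: its path and last-modified time (ms). */
    public static final class FileStamp {
        public final String name;
        public final long lastModified;

        public FileStamp(String name, long lastModified) {
            this.name = name;
            this.lastModified = lastModified;
        }
    }

    /** Latest last-modified time across the main data file and all update files. */
    public static long latestModified(List<FileStamp> files) {
        long latest = Long.MIN_VALUE;
        for (FileStamp f : files) {
            latest = Math.max(latest, f.lastModified);
        }
        return latest;
    }

    /**
     * Names of the update files to re-read: only those modified after the
     * latest timestamp seen at the previous load (an assumption of this sketch).
     */
    public static List<String> filesToReload(List<FileStamp> updateFiles, long latestAtLastLoad) {
        List<String> result = new ArrayList<>();
        for (FileStamp f : updateFiles) {
            if (f.lastModified > latestAtLastLoad) {
                result.add(f.name);
            }
        }
        return result;
    }
}
```

With the files from the original report, touching /tmp/data.2.lst after the
first load would leave /tmp/data.1.lst at or below the recorded timestamp, so
only /tmp/data.2.lst would be selected.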

On Mon, Nov 15, 2010 at 2:31 PM, Jordan, Eric <eric.jordan@navteq.com> wrote:

> Hi,
>
> This is my second time trying to post this - the first time did not seem to
> work; my apologies if this ends up being a duplicate post.
>
> I'm having an issue with FileDataModel.  In particular, suppose you have a
> main data file (say, /tmp/data.lst) and two incremental files (say,
> /tmp/data.1.lst and /tmp/data.2.lst).  Call refresh() on the recommender to
> cause it to read in all three files.  Now, "touch /tmp/data.2.lst" to
> simulate updating the second data file.  Then, call refresh again.  I find
> that the system reads in _both_ /tmp/data.1.lst and /tmp/data.2.lst, when I
> was expecting it to only read in /tmp/data.2.lst.
>
> This would obviously be a performance issue.  Am I doing something wrong?
>  Any help appreciated.
>
> Thanks,
>
> Eric
