mahout-dev mailing list archives

From Ted Dunning <>
Subject Re: Regarding Online Recommenders
Date Tue, 16 Jul 2013 18:45:52 GMT
Netflix is a small dataset.  12G for that seems quite excessive.

Note also that this is before you have done any work.

Ideally, 100 million observations should take << 1GB.
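A rough back-of-envelope sketch of where the memory goes (this is not Mahout code; the per-entry costs are assumed, typical figures for a 64-bit JVM, not measurements). Parallel primitive arrays land near ~1.2GB for 100M (user, item, rating) triples, so getting well under 1GB would need further compaction (smaller indices, byte-scaled ratings, or compression), while a boxed hash map blows up by several times:

```java
public class MemoryEstimate {
    public static void main(String[] args) {
        long observations = 100_000_000L;

        // Parallel primitive arrays: int user index + int item index + float rating.
        long primitiveBytes = observations * (4 + 4 + 4);

        // Boxed HashMap<Long, Float>: assumed typical 64-bit JVM costs per entry:
        // entry object ~32 bytes + boxed Long key ~24 + boxed Float value ~16.
        long hashMapBytes = observations * (32 + 24 + 16);

        System.out.printf("primitive arrays: %.1f GB%n", primitiveBytes / 1e9);
        System.out.printf("boxed hash map:   %.1f GB%n", hashMapBytes / 1e9);
    }
}
```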

On Tue, Jul 16, 2013 at 8:19 AM, Peng Cheng <> wrote:

> The second idea is indeed splendid; we should separate the
> time-complexity-first and space-complexity-first implementations. What I'm
> not quite sure about is whether we really need to create two interfaces
> instead of one. Personally, I think a 12G heap is not that high, right?
> Most new laptops can already handle that (emphasis on laptop). And if we
> replace the hash map (the culprit of the high memory consumption) with a
> list/linked list, it would simply degrade lookup time complexity to O(n)
> for a linear search, which is not too bad either. The current DataModel is
> the result of careful thought and has undergone extensive testing; it is
> easier to expand on top of it than to subvert it.
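The tradeoff in the quoted paragraph can be sketched minimally (hypothetical code, not the actual DataModel API): keeping (itemID, rating) pairs in plain parallel arrays is compact but makes each lookup an O(n) scan, while a boxed HashMap gives O(1) expected lookup at a much higher per-entry memory cost:

```java
import java.util.HashMap;
import java.util.Map;

public class LookupTradeoff {
    // Compact form: parallel primitive arrays, O(n) linear scan per lookup.
    static float ratingLinear(long[] itemIDs, float[] ratings, long itemID) {
        for (int i = 0; i < itemIDs.length; i++) {
            if (itemIDs[i] == itemID) {
                return ratings[i];
            }
        }
        return Float.NaN; // not rated
    }

    public static void main(String[] args) {
        long[] items = {101L, 205L, 309L};
        float[] ratings = {4.0f, 3.5f, 5.0f};

        // Boxed map: O(1) expected lookup, but far more memory per entry.
        Map<Long, Float> map = new HashMap<>();
        for (int i = 0; i < items.length; i++) {
            map.put(items[i], ratings[i]);
        }

        System.out.println(ratingLinear(items, ratings, 205L));
        System.out.println(map.get(205L));
    }
}
```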
