mahout-user mailing list archives

From Sean Owen <>
Subject Re: Which database should I use with Mahout
Date Sun, 19 May 2013 17:20:35 GMT
My first point is that you really don't want to use the database as a
data model directly. It is far too slow.
Instead, you want a data model implementation that reads all of the
data into memory in a single serial pass. In that case it makes no
difference where the data comes from, because it is read just once.
A file works just as well as a fancy database. In fact it's probably
easier and faster.
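
To make that concrete, here is a minimal sketch of the "read once, serially, into memory" idea in plain Java. It is not Mahout code: the class InMemoryPreferences and its methods are hypothetical names made up for illustration, and it assumes a CSV file with one "userID,itemID,value" line per preference. In Mahout itself, FileDataModel is the class that fills this role for files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

// Hypothetical illustration class; not part of Mahout.
final class InMemoryPreferences {

    // userID -> (itemID -> preference value), held entirely in memory.
    private final Map<Long, Map<Long, Float>> prefs = new HashMap<>();

    private InMemoryPreferences() {}

    // Reads the file once, front to back; later lookups never touch it.
    static InMemoryPreferences load(Path csv) {
        try (Stream<String> lines = Files.lines(csv)) {
            return parse(lines::iterator);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Parses "userID,itemID,value" lines (assumed format); blank lines are skipped.
    static InMemoryPreferences parse(Iterable<String> lines) {
        InMemoryPreferences model = new InMemoryPreferences();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue;
            }
            String[] fields = line.split(",");
            long user = Long.parseLong(fields[0].trim());
            long item = Long.parseLong(fields[1].trim());
            float value = Float.parseFloat(fields[2].trim());
            model.prefs.computeIfAbsent(user, u -> new HashMap<>()).put(item, value);
        }
        return model;
    }

    // A lookup is a hash-map access, not a database query.
    // Returns null when the user or the item is unknown.
    Float preference(long user, long item) {
        Map<Long, Float> items = prefs.get(user);
        return items == null ? null : items.get(item);
    }

    int userCount() {
        return prefs.size();
    }
}
```

Once load() has returned, every lookup the recommender makes is served from the map, so it no longer matters whether the rows originally came from a file or were exported from a database.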

On Sun, May 19, 2013 at 10:14 AM, Tevfik Aytekin
<> wrote:
> Thanks Sean, but I did not understand your answer. Could you please explain it again?
> On Sun, May 19, 2013 at 8:00 PM, Sean Owen <> wrote:
>> It doesn't matter, in the sense that it is never going to be fast
>> enough for real-time at any reasonable scale if actually run off a
>> database directly. One operation results in thousands of queries. It's
>> going to read data into memory anyway and cache it there. So, whatever
>> is easiest for you. The simplest solution is a file.
>> On Sun, May 19, 2013 at 9:52 AM, Ahmet Ylmaz
>> <> wrote:
>>> Hi,
>>> I would like to use Mahout to make recommendations on my web site. Since the
>>> data is going to be big, hopefully, I plan to use the Hadoop implementations of the recommender.
>>> I'm currently storing the data in MySQL. Should I continue with it or should
>>> I switch to a NoSQL database such as MongoDB or something else?
>>> Thanks
>>> Ahmet
