mahout-user mailing list archives

From "Scott C. Cote" <scottcc...@gmail.com>
Subject Re: get similar items
Date Fri, 14 Feb 2014 17:25:18 GMT
I generate my initial sequence files directly from records in my MySQL
database.  Follow Martin's advice and go through the tutorial; it is
very helpful.  Also, I really like MiA (Mahout in Action), even though it
is a couple of versions behind.  The clustering chapters still seem to be
accurate :)

You really need to get a good feel for what kind of vectors you are going
to use as input to your clustering.
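
To make "what kind of vectors" concrete, here is a minimal sketch in plain
Java (no Mahout classes) of one common choice: a TF-IDF vector per movie
description, compared by cosine similarity. The class and method names
(DescriptionSimilarity, tfidf, cosine, mostSimilar) are made up for
illustration and are not Mahout API, and the tokenizer is deliberately
naive. It assumes the descriptions have already been read from the
database into a list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Mahout: ranks descriptions by the
// cosine similarity of their TF-IDF vectors.
class DescriptionSimilarity {

    // Lowercase the text and split on anything that is not a-z or 0-9.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Build one sparse vector (term -> TF-IDF weight) per description.
    static List<Map<String, Double>> tfidf(List<String> descriptions) {
        List<Map<String, Double>> vectors = new ArrayList<>();
        Map<String, Integer> docFreq = new HashMap<>();
        for (String d : descriptions) {
            Map<String, Double> tf = new HashMap<>();
            for (String term : tokenize(d)) tf.merge(term, 1.0, Double::sum);
            for (String term : tf.keySet()) docFreq.merge(term, 1, Integer::sum);
            vectors.add(tf);
        }
        int n = descriptions.size();
        for (Map<String, Double> v : vectors) {
            for (Map.Entry<String, Double> e : v.entrySet()) {
                // Unsmoothed IDF: a term found in every description gets weight 0.
                double idf = Math.log((double) n / docFreq.get(e.getKey()));
                e.setValue(e.getValue() * idf);
            }
        }
        return vectors;
    }

    // Cosine similarity of two sparse vectors; 0 if either has zero length.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Double other = b.get(e.getKey());
            if (other != null) dot += e.getValue() * other;
        }
        for (double w : b.values()) normB += w * w;
        if (normA == 0 || normB == 0) return 0;
        return dot / Math.sqrt(normA * normB);
    }

    // Index of the description most similar to the one at 'query',
    // or -1 if there is no other description.
    static int mostSimilar(List<Map<String, Double>> vectors, int query) {
        int best = -1;
        double bestScore = -1;
        for (int i = 0; i < vectors.size(); i++) {
            if (i == query) continue;
            double s = cosine(vectors.get(query), vectors.get(i));
            if (s > bestScore) { bestScore = s; best = i; }
        }
        return best;
    }
}
```

With 1000+ rows a brute-force comparison like this is cheap enough to try
out before deciding whether you need clustering at all.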

SCott

On 2/14/14 1:32 AM, "N!" <12481228@qq.com> wrote:

>Thank you Sebastian, Martin, and Scott.
>I checked
>'https://cwiki.apache.org/confluence/display/MAHOUT/Quick+tour+of+text+analysis+using+the+Mahout+command+line'.
>It looks like the case I described. But I am using Java with a MySQL
>database; is there an example related to this?
>
>
>thanks.
>------------------ Original ------------------
>From:  "Scott C. Cote";<scottccote@gmail.com>;
>Date:  Wed, Feb 12, 2014 11:47 PM
>To:  "user@mahout.apache.org"<user@mahout.apache.org>;
>
>Subject:  Re: get similar items
>
>
>
>Since you are relying on unguided data - switch from
>recommenders/classifier to clustering.
>
>Anyone else agree with me on this???
>
>SCott
>
>On 2/12/14 9:04 AM, "Martin, Nick" <NiMartin@pssd.com> wrote:
>
>>Yeah, since it would appear you're lacking requisite data for
>>recommenders the only other thing I can think of in this case is
>>potentially treating the movie records as documents and clustering them
>>(via whatever might be in the 'description' field).
>>
>>Have a look here
>>https://cwiki.apache.org/confluence/display/MAHOUT/Quick+tour+of+text+analysis+using+the+Mahout+command+line
>>and see if you can support something like this with your dataset.
>>
>>-----Original Message-----
>>From: Sebastian Schelter [mailto:ssc.open@googlemail.com]
>>Sent: Wednesday, February 12, 2014 6:28 AM
>>To: user@mahout.apache.org
>>Subject: Re: get similar items
>>
>>Hi,
>>
>>Mahout's recommenders are based on analyzing interactions between users
>>and items/movies, e.g. ratings or counts of how often a movie was watched.
>>
>>
>>On 02/12/2014 11:34 AM, N! wrote:
>>> Hi all:
>>>   Does anyone have any suggestions for the questions below?
>>>
>>>
>>>   thanks a lot.
>>>
>>>
>>> ------------------ Original ------------------
>>> Sender: "N!"<12481228@qq.com>;
>>> Send time: Wednesday, Feb 12, 2014 6:17 PM
>>> To: "user"<user@mahout.apache.org>;
>>>
>>> Subject: Re: get similar items
>>>
>>>
>>>
>>> Hi Sean:
>>> Thanks for the reply.
>>> Assume I have only one table named 'movie' with 1000+ records. This
>>> table has three columns: 'id', 'movieName', 'movieDescription'.
>>> Can Mahout calculate the most similar movies for a given movie, based
>>> only on the 'movie' table?
>>> Code like: List mostSimilarMovieList =
>>> recommender.mostSimilar(movieId).
>>> If not, do you have any suggestions for this scenario?
>>>
>>
>
>
>.


