mahout-user mailing list archives

From Pat Ferrel <...@occamsmachete.com>
Subject Re: User based recommender
Date Tue, 09 Dec 2014 19:12:41 GMT
You may also be able to just run the same csv through multiple times, picking a different item-ID
column for each “action”. Here “csv” means a text file with some delimiter, not
the full csv spec with headers, quoted values, and escaped characters.

On Dec 8, 2014, at 4:11 PM, Pat Ferrel <pat@occamsmachete.com> wrote:

No classifier, just turn the one csv into several, each being a collection for one action.

user ID,item ID

Where the item ID is whatever the action corresponds to. For instance, <user ID>,<location ID>
for being at a location, or <user ID>,<item ID> for a purchase, etc. These files
can go directly into the command line of spark-itemsimilarity: --input will always be the
file with purchases, and --input2 will be the file with the secondary action.
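
Editor's sketch of the split described above, assuming a hypothetical column layout of customer id, item id, shipping location. The function name, column positions, and sample rows are illustrative and do not come from the thread. Each resulting file holds plain "user ID,item ID" lines for one action.

```python
import csv
import io
import os
import tempfile


def split_actions(rows, user_col, action_cols):
    """Return {action_name: [(user_id, item_id), ...]} from delimited rows."""
    out = {name: [] for name in action_cols}
    for row in rows:
        user = row[user_col].strip()
        for name, col in action_cols.items():
            value = row[col].strip()
            if value:  # skip empty cells
                out[name].append((user, value))
    return out


# Hypothetical sample: customer id, item id, shipping location
sample = "u1,i1,NL-1012\nu2,i1,NL-1012\nu2,i2,US-94105\n"
rows = list(csv.reader(io.StringIO(sample)))
actions = split_actions(rows, user_col=0,
                        action_cols={"purchase": 1, "location": 2})

# Write one two-column file per action into a scratch directory.
with tempfile.TemporaryDirectory() as tmp:
    for name, pairs in actions.items():
        path = os.path.join(tmp, name + ".csv")
        with open(path, "w", newline="") as f:
            csv.writer(f).writerows(pairs)
```

In this sketch the purchase file would be the one passed as --input and the location file as --input2.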

On Dec 8, 2014, at 1:22 AM, Yash Patel <yashpatel1230@gmail.com> wrote:

Most columns have different values. When you say preprocess, do you mean
using classifiers?

My dataset is highly structured in nature, so I don't understand how a
classifier will work.
On Dec 8, 2014 2:20 AM, "Pat Ferrel" <pat@occamsmachete.com> wrote:

> If there is some “filter” column that flags one type of item or another
> then yes. Otherwise you’ll have to preprocess your data for input.
> 
> On Dec 7, 2014, at 2:27 PM, Yash Patel <yashpatel1230@gmail.com> wrote:
> 
> Will cross recommendation still work, considering item similarity checks
> multiple columns for items and my dataset has only one column for items? It
> contains different item ids.
> 
> 
> 
> 
> On Sun, Dec 7, 2014 at 5:26 PM, Pat Ferrel <pat@occamsmachete.com> wrote:
> 
>> To use cross-recommendations with multiple actions you may be able to get
>> away with using the pre-packaged command line job “spark-itemsimilarity”.
>> At one point you said you were more interested in the Mahout Hadoop
>> Mapreduce recommender, which cannot create these cross-recommendations.
>> 
>> I don’t see any need to use the interactive Mahout or Spark shell. Calling
>> Scala from Java is pretty complex so I’d recommend starting from the
>> running driver so you have a base of Scala code to start from. Calling Java
>> from Scala is dead simple, it’s done throughout Mahout code. This should
>> help make Scala a little less daunting. I use IntelliJ and there should be
>> no problem using Eclipse in the same manner.
>> 
>> 
>> On Dec 6, 2014, at 3:55 PM, Yash Patel <yashpatel1230@gmail.com> wrote:
>> 
>> I have something that shows the user locations. However, is it possible to
>> implement this without using the Apache Spark shell? I found it quite
>> confusing to use without any examples.
>> 
>> I have a Windows environment and I am using Java in Eclipse Luna to code
>> the recommender.
>> On Dec 6, 2014 9:09 PM, "Pat Ferrel" <pat@occamsmachete.com> wrote:
>> 
>>> You can often think of or re-phrase a piece of data (a column in your
>>> interaction data) as an action, like “being at a location”. Then use
>>> cross-cooccurrence to calculate a cross-indicator. So the location can be
>>> used to recommend purchases.
>>> 
>>> If you do this, the location should be something that can have
>>> cooccurrence, so instead of lat-lon some part of an address. Maybe
>>> country+postal-code would be good. Something unique that identifies a
>>> location where other users can be.
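
Editor's sketch of the point above: a coarse key such as country plus postal code can be shared by many users, whereas raw lat-lon pairs almost never match exactly. The function name and the "COUNTRY-POSTALCODE" format are assumptions made for illustration.

```python
def location_id(country, postal_code):
    """Build a coarse location key, e.g. 'US-94105', that many users can share."""
    country = country.strip().upper()
    postal = postal_code.replace(" ", "").upper()
    return country + "-" + postal


# Differently formatted inputs for the same postal area map to one ID,
# so two users there can cooccur on that location.
a = location_id("us", "94105")
b = location_id("US ", " 94105")
```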
>>> 
>>> 
>>> On Dec 5, 2014, at 11:10 AM, Ted Dunning <ted.dunning@gmail.com> wrote:
>>> 
>>> Cross recommendation can apply if you use the multiple kinds of columns to
>>> impute actions relative to characteristics.  That is, people at this
>>> location buy this item.  Then when you do the actual query, the query
>>> contains detailed history of the person, but also recent location history.
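
Editor's sketch of the "people at this location buy this item" idea as raw cross-cooccurrence counts. This only counts distinct users per (location, item) pair. It is not Mahout's implementation and applies no significance scoring or downsampling. All names and sample data are hypothetical.

```python
from collections import defaultdict


def cross_cooccurrence(purchases, locations):
    """Count, for each (location, item) pair, the users who have both."""
    locs_by_user = defaultdict(set)
    for user, loc in locations:
        locs_by_user[user].add(loc)

    users_by_pair = defaultdict(set)
    for user, item in purchases:
        # .get avoids creating empty entries for users with no location
        for loc in locs_by_user.get(user, ()):
            users_by_pair[(loc, item)].add(user)

    return {pair: len(users) for pair, users in users_by_pair.items()}


purchases = [("u1", "i1"), ("u2", "i1"), ("u2", "i2")]
locations = [("u1", "NL-1012"), ("u2", "NL-1012"), ("u3", "US-94105")]
counts = cross_cooccurrence(purchases, locations)
```

A high count for a pair suggests the location is a candidate indicator for that item. A real system would weigh that count against how common the location and item are overall.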
>>> 
>>> 
>>> 
>>> On Thu, Dec 4, 2014 at 7:17 AM, Yash Patel <yashpatel1230@gmail.com>
>>> wrote:
>>> 
>>>> Cross recommenders don't seem applicable because this dataset doesn't
>>>> represent different actions by a user; it just contains transaction
>>>> history (i.e. customer id, item id, shipping location, sales amount of that
>>>> item, item category, etc.).
>>>> 
>>>> Maybe location, sales per item (similarity might lead to knowledge of people
>>>> who share the same purchasing patterns), etc.
>>>> 
>>>> 
>>>> On Wed, Dec 3, 2014 at 5:28 PM, Ted Dunning <ted.dunning@gmail.com>
>>> wrote:
>>>> 
>>>>> On Wed, Dec 3, 2014 at 6:22 AM, Yash Patel <yashpatel1230@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> I have multiple different columns such as category, shipping location, item
>>>>>> price, online user, etc.
>>>>>> 
>>>>>> How can I use all these different columns and improve recommendation
>>>>>> quality (i.e. calculate more precise similarity between users by use of
>>>>>> location, item price)?
>>>>>> 
>>>>> 
>>>>> For some kinds of information, you can build cross recommenders off of that
>>>>> other information.  That incorporates this other information in an
>>>>> item-based system.
>>>>> 
>>>>> Simply hand coding a similarity usually doesn't work well.  The problem is
>>>>> that you don't really know which factors really represent actionable and
>>>>> non-redundant user similarity.
>>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
> 
> 


