mahout-user mailing list archives

From Andrew Psaltis <Andrew.Psal...@Webtrends.com>
Subject Re: Setting up a recommender
Date Sat, 20 Jul 2013 18:25:49 GMT
I am very interested in collaborating on the off-line to Solr part. Just
let me know how we want to get going.

Thanks,
Andrew





On 7/19/13 4:45 PM, "Ted Dunning" <ted.dunning@gmail.com> wrote:

>OK.  I think the crux here is the off-line to Solr part so let's see who
>else pops up.
>
>Having a solr maven could be very helpful.
>
>
>On Fri, Jul 19, 2013 at 3:39 PM, Luis Carlos Guerrero Covo <
>lcguerrerocovo@gmail.com> wrote:
>
>> I'm currently working for a portal that has a similar use case, and I was
>> thinking of implementing this in a similar way. I'm generating
>> recommendations using Python scripts based on similarity measures
>> (content-based recommendation), using only Euclidean distance and some
>> weights for each attribute. I want to use Mahout's
>> GenericItemBasedRecommender to generate these same recommendations without
>> user data (we are not tracking user-to-item relationships right now). I was
>> thinking of pushing the generated recommendations to Solr using atomic
>> updates, since my fields are all stored right now. Since this is very
>> similar to what I'm trying to accomplish, I would sign up to collaborate in
>> any way I can, since I'm fairly familiar with Solr and I'm starting to learn
>> my way around Mahout.
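
A minimal sketch of the atomic-update idea described above, in Python. It only builds the JSON body; the unique key `id`, the multi-valued field name `recommendations`, and the sample item IDs are assumptions made for illustration, not details from this thread.

```python
import json

def atomic_update_doc(item_id, recommended_ids, field="recommendations"):
    """Build one Solr atomic-update document that overwrites a single field."""
    # The "set" modifier replaces the current value of the field and leaves
    # the other stored fields of the document as they are.
    return {"id": item_id, field: {"set": list(recommended_ids)}}

# Hypothetical output of an offline job: item id -> ranked similar item ids.
recs = {"item-1": ["item-7", "item-3"], "item-2": ["item-1"]}

# One JSON array with an update document per item.
payload = json.dumps([atomic_update_doc(i, r) for i, r in recs.items()])
print(payload)
```

The resulting body would be sent as JSON to the update handler of the Solr collection. Atomic updates rebuild the document from its stored fields, which is why having all fields stored matters here.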
>>
>>
>> On Fri, Jul 19, 2013 at 5:12 PM, Sebastian Schelter <ssc@apache.org>
>> wrote:
>>
>> > I would also be willing to provide guidance and advice for anyone taking
>> > this on. I can especially help with the offline analysis part.
>> >
>> > --sebastian
>> >
>> >
>> > 2013/7/19 Ted Dunning <ted.dunning@gmail.com>
>> >
>> > > I would be happy to supervise a project to implement a demo of this if
>> > > anybody is willing to do the grunt work of gluing things together.
>> > >
>> > > Sooo, if you would like to work on this, here is a suggested project.
>> > >
>> > > This project would entail:
>> > >
>> > > a) build a synthetic data source
>> > >
>> > > b) write scripts to do the off-line analysis
>> > >
>> > > c) write scripts to export to Solr
>> > >
>> > > d) write a very quick web facade over Solr to make it look like a
>> > > recommendation engine.  This would include
>> > >
>> > >   d.1) a "most popular page" that does combined popularity rise and
>> > > recommendation
>> > >
>> > >   d.2) a "personal recommendation page" that does just recommendation
>> > > with dithering
>> > >
>> > >   d.3) item pages with "related items" at the bottom
>> > >
>> > > e) work with others to provide high quality system walk-through and
>> > > install directions
>> > >
>> > > If you want to bite on this, we should arrange a weekly video hangout.
>> > > I am willing to commit to guiding and providing detailed technical
>> > > approaches.  You should be willing to commit to actually doing stuff.
>> > >
>> > > The goal would be to provide a fully worked out scaffolding of a
>> > > practical recommendation system that presumably would become an example
>> > > module in Mahout.
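
The "dithering" in item d.2 is not spelled out in this thread. A rough sketch of one common formulation follows, assuming each result is re-scored as log(rank) plus Gaussian noise and the list is re-sorted. The function name `dither` and the `epsilon` parameter are illustrative choices, not part of Mahout or Solr.

```python
import math
import random

def dither(ranked_items, epsilon=2.0, rng=None):
    """Reorder a ranked list by adding noise to the log of each rank.

    epsilon = 1.0 gives zero noise and keeps the original order; larger
    values shuffle more, mostly among the lower-ranked items.
    """
    rng = rng or random.Random()
    sigma = math.log(epsilon)  # assumed noise scale; tune per application
    noisy = [
        (math.log(rank) + rng.gauss(0.0, sigma), item)
        for rank, item in enumerate(ranked_items, start=1)
    ]
    noisy.sort(key=lambda pair: pair[0])
    return [item for _, item in noisy]

items = ["a", "b", "c", "d", "e", "f"]
print(dither(items, epsilon=2.0, rng=random.Random(42)))
```

Because the noise is added on a log scale, the top few results tend to stay near the top while deeper results occasionally surface, so repeat visitors do not always see the identical page.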
>> > >
>> > >
>> > > On Fri, Jul 19, 2013 at 1:08 PM, B Lyon <bradflyon@gmail.com> wrote:
>> > >
>> > > > +1 as well.  Sounds fun.
>> > > >
>> > > > On Fri, Jul 19, 2013 at 4:06 PM, Dominik Hübner
>> > > > <contact@dhuebner.com> wrote:
>> > > >
>> > > > > +1 for getting something like that in a future release of Mahout
>> > > > >
>> > > > > On Jul 19, 2013, at 10:02 PM, Sebastian Schelter <ssc@apache.org>
>> > > > > wrote:
>> > > > >
>> > > > > > It would be awesome if we could get a nice, easily deployable
>> > > > > > implementation of that approach into Mahout before 1.0
>> > > > > >
>> > > > > >
>> > > > > > 2013/7/19 Ted Dunning <ted.dunning@gmail.com>
>> > > > > >
>> > > > > >> My current advice is to use Hadoop (if necessary) to build a
>> > > > > >> sparse item-item matrix based on each kind of behavior you have
>> > > > > >> and then drop those similarities into a search engine to deliver
>> > > > > >> the actual recommendations.  This allows lots of flexibility in
>> > > > > >> terms of which kinds of inputs you use for the recommendation and
>> > > > > >> lets you blend recommendations with search and geo-location.
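
A small in-memory sketch of the item-item idea above, in Python. It uses raw co-occurrence counts as a stand-in for a proper similarity or significance test, and the user histories, the `related` field name, and both helper functions are made up for illustration. A production job would presumably run on Hadoop, as suggested.

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence(histories):
    """Count how often each pair of items appears in the same user history."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in histories.values():
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts

def top_related(counts, item, k=2):
    """Return the k items that co-occur most often with `item`."""
    row = counts.get(item, {})
    # Sort by descending count, breaking ties by item id for stable output.
    return sorted(row, key=lambda other: (-row[other], other))[:k]

# Hypothetical behavior log: user id -> items the user interacted with.
histories = {
    "u1": ["a", "b", "c"],
    "u2": ["a", "b"],
    "u3": ["b", "c"],
}

counts = cooccurrence(histories)
# One search-engine document per item, with related items as an indexed field.
docs = [{"id": item, "related": top_related(counts, item)} for item in sorted(counts)]
print(docs)
```

At query time, a user's recent items would be used as query terms against the `related` field, which is what lets recommendations be combined with ordinary search filters in a single query.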
>> > > > > >>
>> > > > > >>
>> > > > > >> On Fri, Jul 19, 2013 at 12:33 PM, Helder Martins <
>> > > > > >> helder.garay@corp.terra.com.br> wrote:
>> > > > > >>
>> > > > > >>> Hi,
>> > > > > >>> I'm a dev working for a web portal in Brazil and I'm particularly
>> > > > > >>> interested in building an item-based collaborative filtering
>> > > > > >>> recommender for our database of news articles.
>> > > > > >>> After some coding, I was able to get some recommendations using a
>> > > > > >>> GenericItemBasedRecommender, a CassandraDataModel and some custom
>> > > > > >>> classes that store item similarities and migrated item IDs into
>> > > > > >>> Cassandra. But now I'm unsure what is normally done with this
>> > > > > >>> recommender: Should I run it as a daemon, cache the
>> > > > > >>> recommendations in memory and set up a web service to query it
>> > > > > >>> online? Should I preprocess these recommendations for each recent
>> > > > > >>> user and store them somewhere? My first idea was storing all
>> > > > > >>> these recs back into Cassandra, but looking into some classes it
>> > > > > >>> seems to me that the norm is to always read the input data and
>> > > > > >>> store the output using files. Is this a common practice that
>> > > > > >>> benefits from HDFS?
>> > > > > >>> My use case here is something around 70k recommendation requests
>> > > > > >>> per second.
>> > > > > >>>
>> > > > > >>> Thanks in advance,
>> > > > > >>>
>> > > > > >>> --
>> > > > > >>>
>> > > > > >>> Regards,
>> > > > > >>> Helder Martins
>> > > > > >>> Portal Architecture and Backend Systems
>> > > > > >>> +55 (51) 3284-4475
>> > > > > >>> Terra
>> > > > > >>>
>> > > > > >>>
>> > > > > >>
>> > > > >
>> > > > >
>> > > >
>> > > >
>> > > > --
>> > > > BF Lyon
>> > > > http://www.nowherenearithaca.com
>> > > >
>> > >
>> >
>>
>>
>>
>> --
>> Luis Carlos Guerrero Covo
>> M.S. Computer Engineering
>> (57) 3183542047
>>

