mahout-user mailing list archives

From Alan Said <>
Subject RE: Moving a twitter conversation to the mailing list
Date Mon, 08 Nov 2010 13:10:50 GMT
As Sebastian mentions, I'm going to try to make a scalable implementation. Being a Hadoop/Mahout
newbie, however, I'm not really sure how difficult this might end up being.

I intend to do a very general implementation which could be used for (Hy)PLSA as described

M.Sc.(Eng.) Alan Said
Competence Center Information Retrieval & Machine Learning
Technische Universität Berlin / DAI-Lab
Sekr. TEL 14 Ernst-Reuter-Platz 7
10587 Berlin / Germany
Phone:  0049 - 30 - 314 74072
Fax:    0049 - 30 - 314 74003

-----Original Message-----
From: Sebastian Schelter [] 
Sent: Monday, November 08, 2010 1:01 PM
Subject: Moving a twitter conversation to the mailing list

I'm moving a twitter conversation to the mailing list so that it doesn't 
vanish in the short-lived microblogging sphere.

To summarize, @alansaid is looking for an implementation of the 
EM-algorithm as described here: 
I could only point him to an unsuccessful implementation of PLSI tried 
at While this one 
worked for tiny examples, it clearly didn't scale and it had some parts 
of the algorithm wrong IMHO. @sbourke tweeted about using it despite its
scalability issues, but I would clearly discourage anyone from doing this.

Nevertheless, if Alan manages to make this work and scale, I think it
would make a very nice contribution to Mahout. I guess we'd be willing
to help, so Alan, if you need support, just ask on dev@. There's also a
Mahout hackathon planned in Berlin; maybe that would be a good
opportunity to work collaboratively on that implementation.

