mahout-user mailing list archives

From Dan Brickley <dan...@danbri.org>
Subject Re: Mahout for detecting fake profiles in social networks!
Date Tue, 21 Jun 2011 16:11:19 GMT
On 21 June 2011 17:52, Sebastian Schelter <ssc@apache.org> wrote:
> I guess it depends on what features you want to use to detect those fake
> profiles.

Yes, sometimes networks can be copied wholesale. So for example see
http://en.wikipedia.org/wiki/Ex.plode.us
http://brainstorm.tribe.net/thread/34fb1a79-351d-4251-8318-829623c1c9cb
... when ex.plode.us reproduced the entire social graph of tribe.net on
a new site. Thousands of 'genuine fakes'. From the user's point of
view these were perceived as fake copies of their real profile. From a
data structure point of view the graphs were identical, and you'd need
to use technologies like OpenID/OAuth to address the relevant notion
of authenticity. There is also sometimes mischief where a profile is
copied as a way of gaining the trust of the profile owner's friends.

But I guess you're looking more for spam accounts etc.? I.e. the victim
is a site, not a user.

> If you want to look at network features of the social graph there is not
> much Mahout has to offer currently. We had a patch starting a graph mining
> module recently but its only at its very beginning.

This may be of interest:
http://www.amazon.ca/Understanding-Complex-Datasets-Mining-Decompositions/dp/1584888326
... there is a chapter in there on use of graph decompositions for
social graph analysis, and the kinds of preprocessing approaches that
have been adopted, to have social relationships more 'visible' to
later processing. (The chapter seems to be online at
http://91-641.wiki.uml.edu/file/view/graphschapter.pdf though I've no
idea if it is meant to be.) I'm curious how much of that could be
handled within Mahout's framework, but I've not got my head around the
(walk Laplacian etc etc) details.
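
For reference, a minimal sketch of the random-walk ("walk") Laplacian
mentioned above, under one common definition, L_rw = I - D^-1 A, where
A is the adjacency matrix and D the diagonal degree matrix. The book
chapter may define or normalise it differently. This is plain Python,
not Mahout code; the function name and the toy graph are made up for
illustration.

```python
def walk_laplacian(adjacency):
    """Return L_rw = I - D^-1 A for a square adjacency matrix (list of lists).

    Assumes one common definition of the random-walk Laplacian; this is an
    illustrative helper, not part of Mahout or any other library.
    """
    n = len(adjacency)
    laplacian = []
    for i, row in enumerate(adjacency):
        degree = sum(row)
        if degree == 0:
            # D^-1 is undefined for an isolated node.
            raise ValueError(f"node {i} has no edges; D is not invertible")
        laplacian.append(
            [(1.0 if i == j else 0.0) - row[j] / degree for j in range(n)]
        )
    return laplacian


# Hypothetical toy friendship graph with undirected edges 0-1, 0-2, 1-2, 2-3.
toy_graph = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

L = walk_laplacian(toy_graph)
for row in L:
    print(row)
```

Each row of the result sums to zero, because D^-1 A is the transition
matrix of a random walk on the graph and its rows sum to one. The
decompositions discussed in the chapter would then be applied to a
matrix of this kind.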

cheers,

Dan
