mahout-user mailing list archives

From Raviv Pavel <ra...@gigya-inc.com>
Subject Re: Clustering user profiles
Date Fri, 13 Jan 2012 16:57:22 GMT
True.
That's why I think I need a different distance measure for each attribute of
the user.
The distance between coordinates on Earth is different from the distance
between ages, which in turn is different from the distance between two sets
of values.
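
To make the three kinds of distance concrete, here is a minimal Python sketch. The function names, the mean Earth radius constant, and the choice of haversine and Jaccard are my own assumptions for illustration, not anything Mahout provides. Haversine works on angles, so it handles the longitude wrap-around without special casing.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (approximation)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/long points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def age_distance(age1, age2):
    """Plain absolute difference in years."""
    return abs(age1 - age2)

def jaccard_distance(set1, set2):
    """1 - |intersection| / |union|; two empty sets count as identical."""
    if not set1 and not set2:
        return 0.0
    return 1.0 - len(set1 & set2) / len(set1 | set2)
```

With these, haversine_km(0, 179, 0, -179) comes out at roughly 222 km, while haversine_km(89, 0, -89, 0) is nearly half the Earth's circumference.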

I think the only solution would be to develop a custom distance measure
that's aware of the "meaning" of each dimension (or group of dimensions) and
returns the distance accordingly.
Unless there is a way to vectorize user profiles in such a way that will
allow me to use one of the built-in distance measures.
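
A rough sketch of what such a composite measure could look like, in Python for brevity. Everything here (composite_distance, the profile keys, the weights and scales) is hypothetical; in Mahout the same logic would presumably sit behind the DistanceMeasure interface. Each attribute gets its own distance function, is scaled into [0, 1], and the results are combined as a weighted mean.

```python
def composite_distance(u, v, measures):
    """Weighted mean of per-attribute distances, each scaled into [0, 1].

    measures: list of (key, distance_fn, weight, scale) tuples, where
    scale is the raw distance that should count as "maximally far".
    """
    total_weight = sum(weight for _, _, weight, _ in measures)
    total = 0.0
    for key, distance_fn, weight, scale in measures:
        # Cap at 1.0 so one attribute cannot dominate the sum.
        scaled = min(distance_fn(u[key], v[key]) / scale, 1.0)
        total += weight * scaled
    return total / total_weight

# Hypothetical per-attribute measures; the sets are assumed non-empty.
MEASURES = [
    ("age", lambda a, b: abs(a - b), 1.0, 50.0),
    ("interests", lambda a, b: 1.0 - len(a & b) / len(a | b), 1.0, 1.0),
]

alice = {"age": 30, "interests": {"music", "film"}}
bob = {"age": 40, "interests": {"music", "sport"}}
distance = composite_distance(alice, bob, MEASURES)
```

A location entry would plug in the same way, with a great-circle function and a scale in kilometres. Note that a weighted mix like this is not guaranteed to behave well with every clustering algorithm, so the weights would need tuning.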




--Raviv



On Fri, Jan 13, 2012 at 6:44 PM, Jeff Eastman <jdog@windwardsolutions.com> wrote:

> Just remember that Longitude is a spherical coordinate and +179 is closer
> to -179 than their numeric difference. Latitude is spherical too but +89 is
> indeed quite far from -89.
>
>
>
> On 1/13/12 4:36 AM, StreetCat wrote:
>
>> The raw data had location expressed as strings such as "Paris, France" and
>> I translated them into coordinates, so measuring the distance between two
>> users' location would be trivial.
>>
>>
>> On Fri, Jan 13, 2012 at 1:19 PM, Dan Brickley <danbri@danbri.org> wrote:
>>
>>> On 13 January 2012 12:02, Robert Stewart <bstewart.ny@gmail.com> wrote:
>>>
>>>> Rather than using Gender as a single dimension, why not make Male and
>>>> Female as separate dimensions, with values 0 or 1 if True or False?
>>>>
>>>>>> d[1] = 15.5 (latitude)
>>>>>> d[2] = 50.5 (longitude)
>>>>>>
>>> Raw lat/long can be rather cryptic. The Geonames folk have Web
>>> services (and/or downloadable data) that maps these to more socially
>>> relevant entities.
>>>
>>> See http://www.geonames.org/export/web-services.html#findNearby
>>> e.g.
>>> http://api.geonames.org/extendedFindNearby?lat=47.3&lng=9&username=demo
>>>
>>> There's also a lat/long to Wikipedia entry service, see
>>>
>>> http://www.geonames.org/export/wikipedia-webservice.html#findNearbyWikipedia
>>> ...which will get you entities known to DBpedia, Freebase etc.,
>>> allowing more national or regional features to be folded in if needed.
>>>
>>> Why have the machine learning layers re-learn stuff that can just be
>>> looked up in a free encyclopaedia? Better to enrich than
>>> rediscover...?
>>>
>>> cheers,
>>>
>>> Dan
>>>
>>>
>
