mahout-dev mailing list archives

From Suneel Marthi <suneel_mar...@yahoo.com>
Subject Re: Proposal for additional features in Mahout (minkowski Distance, mahalobnis Distance and K-nearest neighbor classifier)
Date Sun, 18 May 2014 18:01:31 GMT
Please first check the Mahout codebase to see what is already there. All of the distance
measures you mention (Mahalanobis, Minkowski, etc.) already exist; see the
org.apache.mahout.common.distance package.



On Sunday, May 18, 2014 1:56 PM, Arunav Sanyal <arunav.sanyal91@gmail.com> wrote:
 


Hi

I am new to Apache Mahout and would like to contribute in whatever humble
way I can.

I see that the Vector class in Apache Mahout does not provide a
Minkowski distance.

The Minkowski distance (http://en.wikipedia.org/wiki/Minkowski_distance)
is a metric that generalizes several distance measures between two
vectors. Depending on its order parameter p, it reduces to the Manhattan
distance (p = 1, which equals the Hamming distance on binary vectors) or
the Euclidean distance (p = 2). I already have a simple solution ready for
review if this is approved. Similarly, I am working on the more general
Mahalanobis distance measure.
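
For reference, the Minkowski distance described above can be sketched in a
few lines. This is only an illustration on plain double[] arrays, not the
existing Mahout implementation and not the patch mentioned here; the class
name MinkowskiSketch and the method name distance are hypothetical.

```java
public final class MinkowskiSketch {
    private MinkowskiSketch() {}

    /**
     * Minkowski distance of order p between two equal-length arrays:
     * (sum_i |a_i - b_i|^p)^(1/p). Assumes p >= 1 so the result is a metric.
     */
    public static double distance(double[] a, double[] b, double p) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        if (p < 1.0) {
            throw new IllegalArgumentException("p must be >= 1");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.pow(Math.abs(a[i] - b[i]), p);
        }
        return Math.pow(sum, 1.0 / p);
    }

    public static void main(String[] args) {
        double[] x = {0.0, 0.0};
        double[] y = {3.0, 4.0};
        System.out.println("p = 1 (Manhattan): " + distance(x, y, 1.0));
        System.out.println("p = 2 (Euclidean): " + distance(x, y, 2.0));
    }
}
```

With p = 1 the sum of absolute differences is returned unchanged, and with
p = 2 the result is the usual Euclidean length of the difference vector.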

My primary motive for introducing these distance measures is to come up
with a generic implementation of the K-nearest neighbor classifier (not to
be confused with K-means clustering). I will be working on that as well shortly.
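
To make the distinction concrete, here is a minimal sketch of a K-nearest
neighbor classifier: it labels a query point by majority vote among the k
closest training points. It assumes plain double[] points, String labels and
a fixed Euclidean distance; the class name KnnSketch and its methods are
hypothetical and are not part of Mahout.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

public final class KnnSketch {
    private KnnSketch() {}

    /** Euclidean distance between two equal-length arrays. */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Returns the most frequent label among the k training points nearest
     * to the query. If several labels tie, the one that reaches the winning
     * count first while scanning from nearest to farthest is returned.
     */
    public static String classify(double[][] points, String[] labels,
                                  double[] query, int k) {
        if (points.length != labels.length) {
            throw new IllegalArgumentException("one label per point is required");
        }
        if (k < 1 || k > points.length) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        // Sort the indices of the training points by distance to the query.
        Integer[] order = new Integer[points.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble(i -> euclidean(points[i], query)));

        // Count votes among the k nearest points.
        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (int j = 0; j < k; j++) {
            String label = labels[order[j]];
            int count = votes.merge(label, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = label;
            }
        }
        return best;
    }
}
```

Unlike K-means, nothing is learned up front: the training points are kept as
they are and every prediction scans them, so the distance measure is the only
part that would need to be pluggable in a generic version.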

If somebody else is working towards these features, I would like to
collaborate and contribute whatever code patches they deem necessary. If
not, I humbly request that the community approve these for inclusion in
Apache Mahout.


Yours sincerely
Arunav Sanyal
-- 
Arunav Sanyal
Graduate student
B.E (Hons) Computer Science
BITS Pilani K.K Birla Goa Campus

Software Engineer
INFORMATICA BUSINESS SOLUTIONS