mahout-dev mailing list archives

From Suneel Marthi
Subject Re: Proposal for additional features in Mahout (minkowski Distance, mahalobnis Distance and K-nearest neighbor classifier)
Date Sun, 18 May 2014 18:01:31 GMT
Please first check the Mahout codebase to see what's already there. All of the distance
measures you mention (Mahalanobis, Minkowski, etc.) already exist. See org.apache.mahout.common.distance.

On Sunday, May 18, 2014 1:56 PM, Arunav Sanyal wrote:


I am new to Apache Mahout and would like to contribute in whatever humble
way I can.

I see that the Vector class in Apache Mahout does not provide a
Minkowski distance.

Minkowski distance is a distance metric that generalizes several distance
measures between two vectors. Depending on its parameters, it can represent
Hamming distance or Euclidean distance. I already have a simple solution ready
for review if this is approved. Similarly, I am working on the more generic
Mahalanobis distance.
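
For readers unfamiliar with the metric, the following is a minimal, self-contained sketch of the Minkowski distance of order p over plain double arrays. It is an illustration only: the class name MinkowskiSketch and its method are hypothetical, and this is neither the patch mentioned above nor Mahout's own implementation.

```java
// Illustrative sketch only: MinkowskiSketch is a hypothetical name,
// not a class from Apache Mahout.
final class MinkowskiSketch {

    // Returns (sum_i |a[i] - b[i]|^p)^(1/p) for p >= 1.
    static double distance(double[] a, double[] b, double p) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        if (p < 1.0) {
            throw new IllegalArgumentException("p must be >= 1 for a valid metric");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.pow(Math.abs(a[i] - b[i]), p);
        }
        return Math.pow(sum, 1.0 / p);
    }
}
```

With p = 2 this reduces to Euclidean distance and with p = 1 to Manhattan distance; on vectors whose entries are all 0 or 1, the p = 1 case counts the differing positions, which is the Hamming distance.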

My primary motive for introducing these distance measures is to come up
with a generic implementation of the k-nearest neighbor classifier (not to
be confused with k-means clustering). I will be working on that as well shortly.
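
As a rough illustration of the classifier idea, here is a small sketch of k-nearest neighbor classification by majority vote. Everything in it is an assumption made for the example: the class name KnnSketch, the use of Euclidean distance, and the tie-breaking rule. It is not code from Mahout or from the proposed patch.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: KnnSketch is a hypothetical name,
// not a class from Apache Mahout.
final class KnnSketch {

    // Labels the query by majority vote among its k nearest training points.
    static String classify(double[][] points, String[] labels, double[] query, int k) {
        if (points.length != labels.length) {
            throw new IllegalArgumentException("one label is required per point");
        }
        if (k < 1 || k > points.length) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        // Sort the indices of the training points by distance to the query.
        Integer[] order = new Integer[points.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble(i -> euclidean(points[i], query)));

        // Count votes among the k nearest points. On a tie, the label that
        // reached the winning count first (nearer neighbors) is kept.
        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (int j = 0; j < k; j++) {
            String label = labels[order[j]];
            int count = votes.merge(label, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = label;
            }
        }
        return best;
    }

    // Euclidean distance; any other distance measure could be substituted here.
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }
}
```

Unlike k-means, which groups unlabeled points into clusters, this procedure needs labeled training points and does no training step beyond storing them.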

If somebody else is working towards these features, I would like to
collaborate and contribute whatever code patches they deem necessary. If
not, I humbly request that the community approve these for inclusion in
Apache Mahout.

Yours sincerely
Arunav Sanyal
Graduate student
B.E (Hons) Computer Science
BITS Pilani K.K Birla Goa Campus

Software Engineer