mahout-user mailing list archives

From praneet mhatre
Subject High Dimensional Datasets for Binary Classification
Date Wed, 09 May 2012 03:06:29 GMT
Hi All / Ted,

I looked through the mailing list first, since similar questions have been
asked before, but couldn't really find what I wanted.

Quick background - I have been working on higher order learning algorithms
(Feature Sharding to be specific) for some time. While getting this stuff
into Mahout will require some solid progress on the pig/mahout integration
front among other things, I have been exploring how vertical sharding
generally affects classifier performance using some simple code I've
written in Weka.

Most of my studies so far have been done on moderate-dimensional datasets.
Can someone please suggest some high- or very-high-dimensional datasets that
are suitable for binary classification and freely available?

Thank you!

Praneet Mhatre
Graduate Student
Donald Bren School of ICS
University of California, Irvine
