mahout-user mailing list archives

From Dmitriy Lyubimov <dlie...@gmail.com>
Subject Re: Welcome new committers: Shannon Quinn and Dmitry Lyubimov
Date Sun, 13 Feb 2011 22:13:34 GMT
Thanks Grant and everybody for the welcome,

I am certainly thrilled to be able to participate in the Mahout
effort as a committer.

I am currently working as an architect for a small startup company
called Inadco. I've been looking for a scalable solution for LSI
(among other things) for the past several years, and renewed that
effort when I joined Inadco. We hope an LSI pipeline will help us
assess document similarities while addressing polysemy and synonymy
to some degree. Mahout has an excellent foundation to bootstrap this
process: pipelines to vectorize text documents with a custom
stemmer/analyzer, compute tf/idfs, and select bigrams/trigrams based
on the excellent log-likelihood method (which I think is based on Ted
Dunning's 'Surprise and Coincidence' work). And all of that can run
on Hadoop infrastructure, which packs an incredible number of flops
into a unit of time.

My contribution builds on top of that by introducing a MapReduce-only
Stochastic SVD implementation to the mix (MAHOUT-376, -593). This has
not been a big priority for the company so far, but we have run and
tested the major steps of our LSI pipeline, and I think we will see it
through to production in a couple of months or so, along with fold-in
jobs and a slightly "better-than-random-scanning" HBase-based vector
space indexing.

Going forward, I think we also have a great interest in dyadic
regressions with cold starts (we are in a situation where side
information is extremely sparse), as well as hierarchical document
clustering. Hopefully, some of those future efforts will result in
Mahout contributions. But that's the company's roadmap; my personal
roadmap of course does not have to depend on that too closely. :)

Thanks.
-Dmitriy

On Sat, Feb 12, 2011 at 9:12 AM, Grant Ingersoll <gsingers@apache.org> wrote:
> I am pleased to announce that the Mahout PMC has, in recognition of their continued contributions
> to Mahout, elected Shannon Quinn and Dmitry Lyubimov to be committers on the project.  Please
> join me in giving a warm welcome!
>
> Dmitry and Shannon, it's customary for new committers to write a paragraph or so of introduction
> about themselves, if you don't mind sharing a bit about yourself and how you use Mahout.
>
> Thanks,
> Grant
