hadoop-common-dev mailing list archives

From "Arnoldo Muller" <arnoldomul...@gmail.com>
Subject Student Projects: Filesystem namespace partitioning
Date Thu, 09 Oct 2008 07:21:11 GMT

My name is Arnoldo Muller, and I am a final-year PhD candidate.
I am working on similarity search for detecting Open Source license violations
(www.furiachan.org).  In my spare time, I also code a similarity
search engine (www.obsearch.net).

I am interested in the Apache Hadoop Open Source Student Project:

"Performance evaluation of existing Locality Sensitive Hashing schemes.
Research on new hashing schemes for filesystem namespace partitioning"

If nobody is working on this, I would like to know more about the scope of the
project. Does it make sense to define a distance function so that
similar namespaces are grouped together into the same "bucket"?
If so, I have three or four metric trees that could be used for the comparison.
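
To make the question concrete, here is a minimal sketch of one possible distance function over namespace paths and a naive way to group paths into buckets with it. Everything in it is an assumption for illustration: the names `path_distance` and `bucket_paths` are hypothetical, the distance is the plain tree distance between two paths (not something taken from the project description), and the greedy pivot grouping only stands in for a real metric tree or hashing scheme.

```python
def split(path):
    """Split a filesystem path into its non-empty components."""
    return [c for c in path.split("/") if c]


def path_distance(a, b):
    """Tree distance between two paths (assumed metric, for illustration).

    Counts the edges needed to walk from one path up to the deepest
    common ancestor and back down to the other path.
    """
    pa, pb = split(a), split(b)
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return len(pa) + len(pb) - 2 * common


def bucket_paths(paths, radius):
    """Greedy grouping: a path joins the first bucket whose pivot
    (the bucket's first member) is within `radius`; otherwise it
    starts a new bucket."""
    buckets = []  # list of (pivot, members) pairs
    for p in paths:
        for pivot, members in buckets:
            if path_distance(pivot, p) <= radius:
                members.append(p)
                break
        else:
            buckets.append((p, [p]))
    return buckets


if __name__ == "__main__":
    example = ["/user/a/x", "/user/a/y", "/logs/z"]
    for pivot, members in bucket_paths(example, radius=2):
        print(pivot, members)
```

With a radius of 2, sibling files under the same directory end up in the same bucket, while paths in unrelated subtrees start their own. Whether this kind of grouping is what the project intends is exactly what I would like to clarify.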


Arnoldo Muller
