Message-ID: <7921cc8b0810090449o426e1985i763b53c1b50e067e@mail.gmail.com>
Date: Thu, 9 Oct 2008 20:49:42 +0900
From: "Arnoldo Muller"
To: core-dev@hadoop.apache.org
Subject: Student Project: Filesystem namespace partitioning

Hello,

My name is Arnoldo Muller, and I am a final-year PhD candidate. I am working on similarity search for detecting Open Source license violations (www.furiachan.org). In my spare time, I also code a similarity search engine (www.obsearch.net).

I am interested in the Apache Hadoop Open Source Student Project:

"Performance evaluation of existing Locality Sensitive Hashing schemes. Research on new hashing schemes for filesystem namespace partitioning"

If nobody is working on this, I would like to know more about the scope of the project. Do you want to define a distance function so that similar namespaces are grouped together into the same "bucket"? If so, I have three or four metric trees that could be used for the comparison.

Thanks,
Arnoldo Muller