Message-ID: <28170148.53861287987563933.JavaMail.jira@thor>
Date: Mon, 25 Oct 2010 02:19:23 -0400 (EDT)
From: "Ramkumar Vadali (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] Assigned: (MAPREDUCE-1892) RaidNode can allow layered policies more efficiently
In-Reply-To: <20735527.26651277331411619.JavaMail.jira@thor>

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali reassigned MAPREDUCE-1892:
------------------------------------------

    Assignee: Ramkumar Vadali

> RaidNode can allow layered policies more efficiently
> ----------------------------------------------------
>
>                 Key: MAPREDUCE-1892
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1892
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/raid
>            Reporter: Ramkumar Vadali
>            Assignee: Ramkumar Vadali
>
> The RaidNode policy file can have layered policies that can cover a file more than once. To avoid processing a file multiple times (for RAIDing), RaidNode maintains a list of processed files that is used to avoid duplicate processing attempts.
> This is problematic in that a large number of processed files could cause the RaidNode to run out of memory.
> This task proposes a better method of detecting processed files. The method is based on the observation that a more selective policy will have a better match with a file name than a less selective one. Specifically, the more selective policy will have a longer common prefix with the file name.
> So to detect if a file has already been processed, the RaidNode only needs to maintain a list of processed policies and compare the lengths of the common prefixes. If the file has a longer common prefix with one of the processed policies than with the current policy, it can be assumed to be processed already.
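
The check described in the issue can be sketched roughly as follows. This is an illustration only, written in Python for brevity, and not the RaidNode code. It assumes each policy is identified by a source path prefix, and it measures the common prefix in whole path components rather than characters (an assumption; the issue does not say which). The names path_components, common_prefix_length, and is_already_processed are hypothetical.

```python
from typing import Iterable, List


def path_components(path: str) -> List[str]:
    """Split a slash-separated path into its non-empty components."""
    return [part for part in path.split("/") if part]


def common_prefix_length(path_a: str, path_b: str) -> int:
    """Count the leading path components that the two paths share."""
    length = 0
    for a, b in zip(path_components(path_a), path_components(path_b)):
        if a != b:
            break
        length += 1
    return length


def is_already_processed(
    file_path: str,
    current_policy: str,
    processed_policies: Iterable[str],
) -> bool:
    """Apply the rule from the issue description.

    The file is assumed to be processed if it shares a longer common
    prefix with some already-processed policy than with the current one.
    Policies are represented by their source path (an assumption).
    """
    current = common_prefix_length(file_path, current_policy)
    return any(
        common_prefix_length(file_path, done) > current
        for done in processed_policies
    )
```

For example, with processed policies ["/user/warehouse/logs"] and current policy "/user/warehouse", the file "/user/warehouse/logs/part-0" shares three components with the processed policy and two with the current one, so it is treated as already processed. The file "/user/warehouse/data/part-0" shares two components with each and is not. Only the list of processed policies is kept in memory, not the list of files. The sketch follows the rule as stated, so a file that merely overlaps part of a processed policy's path is also treated as processed; an implementation might additionally check that the processed policy actually covers the file.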