Mailing-List: contact mapreduce-dev-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: mapreduce-dev@hadoop.apache.org
Message-ID: <30348780.8621274908299093.JavaMail.jira@thor>
Date: Wed, 26 May 2010 17:11:39 -0400 (EDT)
From: "Ramkumar Vadali (JIRA)" <jira@apache.org>
To: mapreduce-dev@hadoop.apache.org
Subject: [jira] Created: (MAPREDUCE-1819) RaidNode should submit one job per
 Raid policy
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

RaidNode should submit one job per Raid policy
----------------------------------------------

                 Key: MAPREDUCE-1819
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1819
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: contrib/raid
    Affects Versions: 0.20.1
            Reporter: Ramkumar Vadali


The RaidNode currently computes parity files as follows:
1. It calls RaidNode.selectFiles() to determine which files to raid for a policy.
2. It repeats #1 for each configured policy, accumulating a single list of files.
3. It submits one mapreduce job for the accumulated list from #2 using DistRaid.doDistRaid().

This task addresses the fact that #2 and #3 happen sequentially: no job is submitted until files have been selected for every policy. The proposal is to submit a separate mapreduce job for each policy's list of files and to use another thread to track the progress of the submitted jobs. This should reduce the time taken for files to be raided.
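
A rough sketch of the proposed flow, using only the Java standard library. It is not the RaidNode implementation: PerPolicyRaidSketch, runRaidJob() and raidAll() are hypothetical names, and a thread pool stands in for mapreduce job submission. It shows one job being submitted per policy and a separate monitor thread collecting the results.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; none of these names come from the RaidNode code.
class PerPolicyRaidSketch {

    // Stand-in for a mapreduce job: reports how many files it raided.
    static int runRaidJob(String policy, List<String> files) {
        return files.size();
    }

    // Submits one job per policy, then lets a monitor thread track them.
    static Map<String, Integer> raidAll(Map<String, List<String>> filesByPolicy)
            throws InterruptedException {
        ExecutorService jobs = Executors.newCachedThreadPool();
        final Map<String, Future<Integer>> running = new LinkedHashMap<>();

        for (Map.Entry<String, List<String>> e : filesByPolicy.entrySet()) {
            final String policy = e.getKey();
            final List<String> files = e.getValue();
            if (files.isEmpty()) {
                continue; // nothing selected for this policy, so no job
            }
            // Submit immediately instead of waiting for the other policies.
            Callable<Integer> job = () -> runRaidJob(policy, files);
            running.put(policy, jobs.submit(job));
        }

        // The monitor thread waits on each submitted job in turn.
        final Map<String, Integer> done = new LinkedHashMap<>();
        Thread monitor = new Thread(() -> {
            for (Map.Entry<String, Future<Integer>> e : running.entrySet()) {
                try {
                    done.put(e.getKey(), e.getValue().get());
                } catch (InterruptedException | ExecutionException ex) {
                    done.put(e.getKey(), -1); // mark the job as failed
                }
            }
        });
        monitor.start();
        monitor.join(); // a long-running daemon would not block here
        jobs.shutdown();
        return done;
    }
}
```

In this sketch raidAll() blocks until the monitor finishes so that the result can be returned. A long-running RaidNode would presumably keep the monitor running in the background and poll job status instead.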

