hadoop-mapreduce-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1892) RaidNode can allow layered policies more efficiently
Date Tue, 02 Nov 2010 19:02:26 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927548#action_12927548 ]

Hudson commented on MAPREDUCE-1892:

Integrated in Hadoop-Mapreduce-trunk-Commit #527 (See [https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/527/])
    MAPREDUCE-1892. RaidNode can allow layered policies more efficiently.
(Ramkumar Vadali via schen)

> RaidNode can allow layered policies more efficiently
> ----------------------------------------------------
>                 Key: MAPREDUCE-1892
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1892
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/raid
>            Reporter: Ramkumar Vadali
>            Assignee: Ramkumar Vadali
>             Fix For: 0.22.0
>         Attachments: MAPREDUCE-1892.patch, MAPREDUCE-1892.patch
>
> The RaidNode policy file can have layered policies that can cover a file more than once.
> To avoid processing a file multiple times (for RAIDing), RaidNode maintains a list of processed
> files that is used to avoid duplicate processing attempts.
>
> This is problematic in that a large number of processed files could cause the RaidNode
> to run out of memory.
>
> This task proposes a better method of detecting processed files. The method is based
> on the observation that a more selective policy will have a better match with a file name
> than a less selective one. Specifically, the more selective policy will have a longer common
> prefix with the file name.
>
> So to detect if a file has already been processed, the RaidNode only needs to maintain
> a list of processed policies and compare the lengths of the common prefixes. If the file has
> a longer common prefix with one of the processed policies than with the current policy, it
> can be assumed to be processed already.
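
The prefix comparison described above could be sketched roughly as follows. This is an illustration only, not the code in the attached patch: the class name PrefixPolicyCheck, its method names, and the choice to measure the common prefix in whole path components rather than characters are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of the actual RaidNode code.
class PrefixPolicyCheck {

    // Split a path such as "/user/foo/logs" into its non-empty components.
    static List<String> components(String path) {
        List<String> parts = new ArrayList<String>();
        for (String part : path.split("/")) {
            if (!part.isEmpty()) {
                parts.add(part);
            }
        }
        return parts;
    }

    // Number of leading path components shared by the two paths.
    // Assumption: the prefix is measured per component, so that sibling
    // directories such as ".../data" and ".../logs" do not gain a longer
    // match merely from the shared trailing "/".
    static int commonPrefixLength(String a, String b) {
        List<String> x = components(a);
        List<String> y = components(b);
        int limit = Math.min(x.size(), y.size());
        int i = 0;
        while (i < limit && x.get(i).equals(y.get(i))) {
            i++;
        }
        return i;
    }

    // True if some already-processed policy shares a strictly longer prefix
    // with the file than the current policy does, in which case the file is
    // assumed to have been handled by that more selective policy.
    static boolean assumedProcessed(String filePath,
                                    String currentPolicyPath,
                                    List<String> processedPolicyPaths) {
        int current = commonPrefixLength(filePath, currentPolicyPath);
        for (String processed : processedPolicyPaths) {
            if (commonPrefixLength(filePath, processed) > current) {
                return true;
            }
        }
        return false;
    }
}
```

With a processed policy on /user/foo/logs and a current policy on /user/foo, the file /user/foo/logs/part-0 shares three components with the processed policy and two with the current one, so it is skipped; /user/foo/data/x shares two with each and is still processed. Memory use grows with the number of policies rather than the number of files. The check remains a heuristic: a file that shares only part of a processed policy's path beyond the current policy would also be treated as processed.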

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
