hadoop-mapreduce-issues mailing list archives

From "Scott Chen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1892) RaidNode can allow layered policies more efficiently
Date Tue, 02 Nov 2010 01:14:32 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927245#action_12927245 ]

Scott Chen commented on MAPREDUCE-1892:
---------------------------------------

{code}
+    List<PolicyInfo> allPolicies = null;
{code}
We can remove this field because it is not used.

+1 Looks good to me.

> RaidNode can allow layered policies more efficiently
> ----------------------------------------------------
>
>                 Key: MAPREDUCE-1892
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1892
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/raid
>            Reporter: Ramkumar Vadali
>            Assignee: Ramkumar Vadali
>         Attachments: MAPREDUCE-1892.patch
>
>
> The RaidNode policy file can have layered policies that can cover a file more
> than once. To avoid processing a file multiple times (for RAIDing), RaidNode
> maintains a list of processed files that is used to avoid duplicate processing
> attempts.
> This is problematic in that a large number of processed files could cause the
> RaidNode to run out of memory.
> This task proposes a better method of detecting processed files. The method is
> based on the observation that a more selective policy will have a better match
> with a file name than a less selective one. Specifically, the more selective
> policy will have a longer common prefix with the file name.
> So to detect if a file has already been processed, the RaidNode only needs to
> maintain a list of processed policies and compare the lengths of the common
> prefixes. If the file has a longer common prefix with one of the processed
> policies than with the current policy, it can be assumed to be processed
> already.
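
The prefix comparison described above could be sketched roughly as follows. This is only an illustration, not the code in MAPREDUCE-1892.patch: the class and method names are hypothetical, policies are represented by plain source-path strings, and the prefix is measured in whole path components rather than characters (an assumption made here so that sibling directories do not count as a partial match).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names and types are illustrative, not from the patch.
final class LayeredPolicySketch {

    // Split a path into its non-empty components, e.g. "/user/foo" -> [user, foo].
    static List<String> components(String path) {
        List<String> parts = new ArrayList<>();
        for (String part : path.split("/")) {
            if (!part.isEmpty()) {
                parts.add(part);
            }
        }
        return parts;
    }

    // Number of leading path components that the two paths have in common.
    static int commonPrefixLength(String pathA, String pathB) {
        List<String> a = components(pathA);
        List<String> b = components(pathB);
        int limit = Math.min(a.size(), b.size());
        int i = 0;
        while (i < limit && a.get(i).equals(b.get(i))) {
            i++;
        }
        return i;
    }

    // True if some already-processed policy shares a strictly longer prefix
    // with the file than the current policy does, in which case the file is
    // assumed to have been handled by that more selective policy.
    static boolean coveredByProcessedPolicy(String filePath,
                                            String currentPolicyPath,
                                            List<String> processedPolicyPaths) {
        int current = commonPrefixLength(filePath, currentPolicyPath);
        for (String processed : processedPolicyPaths) {
            if (commonPrefixLength(filePath, processed) > current) {
                return true;
            }
        }
        return false;
    }
}
```

With this approach the memory held is proportional to the number of processed policies rather than the number of processed files, which is the saving the description is after.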

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

