hadoop-common-dev mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-3638) Cache the iFile index files in memory to reduce seeks during map output serving
Date Thu, 18 Sep 2008 23:04:44 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-3638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-3638:

    Affects Version/s:     (was: 0.17.0)
               Status: Open  (was: Patch Available)

After discussing this with Owen and Arun, it's become clear that the LRU semantics are forcing
a lot of complexity into IndexCache, particularly in its synchronization. It can be simplified
substantially by observing LRC (least recently created) semantics instead, which should be
nearly as good in practice, particularly given your gridmix results demonstrating that the
memory limit will rarely be approached. Unfortunately, we do need some sort of paging strategy
to avoid growing the cache without bound. A combination of ConcurrentHashMap and
ConcurrentLinkedQueue should be both reasonably performant and easy to verify. It means
accepting the penalty for traversing the queue when an entry is removed by a job, but there
should only be contention for loading/unloading, not during reads.
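
To make the suggestion concrete, here is a rough sketch of what a creation-ordered cache built
on those two classes might look like. It is not the patch's IndexCache: the class name
CreationOrderCache, the String keys, the byte[] values, and the byte-count limit are all
placeholders I made up for illustration, and the sketch is not hardened against every
interleaving of put and remove.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of a creation-ordered cache; not the actual IndexCache. */
class CreationOrderCache {
    // Lookups go through the map only.
    private final ConcurrentHashMap<String, byte[]> entries = new ConcurrentHashMap<>();
    // Keys in creation order; the head is the oldest entry.
    private final ConcurrentLinkedQueue<String> creationOrder = new ConcurrentLinkedQueue<>();
    private final AtomicLong usedBytes = new AtomicLong();
    private final long maxBytes; // assumed memory limit, in bytes

    CreationOrderCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Reads never touch the queue, so they do not reorder or contend on it. */
    byte[] get(String mapId) {
        return entries.get(mapId);
    }

    /** Adds an entry if absent, then evicts the oldest entries while over the limit. */
    void put(String mapId, byte[] index) {
        if (entries.putIfAbsent(mapId, index) != null) {
            return; // already cached; creation order is unchanged
        }
        creationOrder.add(mapId);
        usedBytes.addAndGet(index.length);
        while (usedBytes.get() > maxBytes) {
            String oldest = creationOrder.poll();
            if (oldest == null) {
                return; // queue drained; nothing left to evict
            }
            byte[] evicted = entries.remove(oldest);
            if (evicted != null) {
                usedBytes.addAndGet(-evicted.length);
            }
        }
    }

    /** Explicit removal, e.g. when a job finishes; the queue traversal is O(n). */
    void remove(String mapId) {
        byte[] removed = entries.remove(mapId);
        if (removed != null) {
            usedBytes.addAndGet(-removed.length);
            creationOrder.remove(mapId);
        }
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        CreationOrderCache cache = new CreationOrderCache(10);
        cache.put("map_0", new byte[4]);
        cache.put("map_1", new byte[4]);
        cache.get("map_0");              // a read does not protect map_0 from eviction
        cache.put("map_2", new byte[4]); // 12 bytes > 10, so the oldest entry is evicted
        System.out.println("map_0 cached: " + (cache.get("map_0") != null));
        System.out.println("entries: " + cache.size());
    }
}
```

The only linear-time operation is creationOrder.remove in the explicit removal path, which is
the penalty mentioned above; get is a plain map lookup.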

Given that the paging semantics will rarely be exercised by integration tests, a unit test
for the cache is also necessary.

> Cache the iFile index files in memory to reduce seeks during map output serving
> -------------------------------------------------------------------------------
>                 Key: HADOOP-3638
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3638
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>            Assignee: Jothi Padmanabhan
>             Fix For: 0.19.0
>         Attachments: hadoop-3638-v1.patch, hadoop-3638-v2.patch, hadoop-3638-v3.patch,
>                      hadoop-3638-v4.patch, hadoop-3638-v5.patch, hadoop-3638-v6.patch
> The iFile index files can be cached in memory to reduce seeks during map output serving.

