Message-ID: <759180462.1221779084366.JavaMail.jira@brutus>
Date: Thu, 18 Sep 2008 16:04:44 -0700 (PDT)
From: "Chris Douglas (JIRA)"
To: core-dev@hadoop.apache.org
Reply-To: core-dev@hadoop.apache.org
Subject: [jira] Updated: (HADOOP-3638) Cache the iFile index files in memory to reduce seeks during map output serving
In-Reply-To: <1051499706.1214370645144.JavaMail.jira@brutus>

     [ https://issues.apache.org/jira/browse/HADOOP-3638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-3638:
----------------------------------

    Affects Version/s:     (was: 0.17.0)
               Status: Open  (was: Patch Available)

After discussing this with Owen and Arun, it's become clear that the LRU semantics are forcing a lot of complexity into IndexCache, particularly in its synchronization. It can be simplified substantially by observing LRC (least recently created) semantics instead, which should be nearly as good in practice, particularly given your gridmix results demonstrating that the memory limit will rarely be approached.

Unfortunately, we do need some sort of paging strategy to avoid growing the cache without bound, but a combination of ConcurrentHashMap and ConcurrentLinkedQueue should be both reasonably performant and easy to verify. The penalty for traversing the queue when a job removes an entry is acceptable, since contention should arise only during loading/unloading rather than during reads.

Given that the paging semantics will rarely be exercised by integration tests, a unit test for the cache is also necessary.

> Cache the iFile index files in memory to reduce seeks during map output serving
> -------------------------------------------------------------------------------
>
>                 Key: HADOOP-3638
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3638
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>            Assignee: Jothi Padmanabhan
>             Fix For: 0.19.0
>
>         Attachments: hadoop-3638-v1.patch, hadoop-3638-v2.patch, hadoop-3638-v3.patch, hadoop-3638-v4.patch, hadoop-3638-v5.patch, hadoop-3638-v6.patch
>
>
> The iFile index files can be cached in memory to reduce seeks during map output serving.
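
For illustration only, the ConcurrentHashMap plus ConcurrentLinkedQueue design proposed in the comment above might be sketched roughly as follows. This is not the IndexCache code from any of the attached patches: the class name CreationOrderCache, its methods, and the loader callback are all hypothetical, and the sketch bounds the cache by entry count rather than by a memory limit to keep it short.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Function;

/** Hypothetical sketch of a cache that evicts in creation order (not LRU). */
class CreationOrderCache<K, V> {
    private final ConcurrentHashMap<K, V> entries = new ConcurrentHashMap<>();
    // Keys in the order they were first inserted; never reordered on reads.
    private final ConcurrentLinkedQueue<K> creationOrder = new ConcurrentLinkedQueue<>();
    private final int maxEntries; // assumption: limit by count, not bytes

    CreationOrderCache(int maxEntries) {
        if (maxEntries < 1) {
            throw new IllegalArgumentException("maxEntries must be >= 1");
        }
        this.maxEntries = maxEntries;
    }

    /** Read path touches only the map, so hits do not contend on the queue. */
    V get(K key, Function<K, V> loader) {
        V cached = entries.get(key);
        if (cached != null) {
            return cached;
        }
        V loaded = loader.apply(key);
        V raced = entries.putIfAbsent(key, loaded);
        if (raced != null) {
            return raced; // another thread loaded the same key first
        }
        creationOrder.add(key);
        evictOldest();
        return loaded;
    }

    /** Explicit removal (e.g. when a job finishes) pays an O(n) queue traversal. */
    void remove(K key) {
        if (entries.remove(key) != null) {
            creationOrder.remove(key);
        }
    }

    /** Drops the oldest-created entries until the cache is within its limit. */
    private void evictOldest() {
        while (entries.size() > maxEntries) {
            K oldest = creationOrder.poll();
            if (oldest == null) {
                return;
            }
            // A stale key (already removed from the map) is simply skipped.
            entries.remove(oldest);
        }
    }

    int size() {
        return entries.size();
    }

    boolean contains(K key) {
        return entries.containsKey(key);
    }
}
```

Because reads never reorder the queue, a frequently read entry is still evicted once enough newer entries have been created. That is the trade-off against LRU that the comment accepts in exchange for simpler synchronization.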