Date: Fri, 15 Jan 2010 01:02:54 +0000 (UTC)
From: "Peter Spiro (JIRA)"
To: common-dev@hadoop.apache.org
Subject: [jira] Created: (HADOOP-6494) MapFile.Reader does not seek to first entry for multi-valued key

MapFile.Reader does not seek to first entry for multi-valued key
----------------------------------------------------------------

                 Key: HADOOP-6494
                 URL:
https://issues.apache.org/jira/browse/HADOOP-6494
             Project: Hadoop Common
          Issue Type: Bug
          Components: io
            Reporter: Peter Spiro
            Priority: Minor


When a MapFile contains a key with multiple entries, and an entry other than the first one for that key happens to be stored in the index, the Reader's seek() and get*() methods will generally not return the first entry. This makes it impossible to retrieve all of the key's entries using next().

One easy solution would be to modify the Writer's append() method so that it only indexes an entry if it is the first entry belonging to its key, e.g.:

  public synchronized void append(WritableComparable key, Writable val)
      throws IOException {

    // Must be computed before checkKey(), which updates lastKey.
    boolean equalsLastKey =
        (size != 0 && comparator.compare(lastKey, key) == 0);

    checkKey(key);

    boolean largeEnoughInterval = size % indexInterval == 0;
    if (largeEnoughInterval && !equalsLastKey) {  // add an index entry
      position.set(data.getLength());             // point to current eof
      index.append(key, position);
    }
    data.append(key, val);                        // append key/value to data

    // Hold the counter at an index boundary until a new key arrives,
    // so that the first entry of the next key gets indexed.
    if (!largeEnoughInterval || !equalsLastKey)
      size++;
  }

(The size variable should then be renamed to something more accurate, since it no longer counts every appended entry.)
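
The effect of the two indexing rules can be illustrated without Hadoop itself. The following is only a rough sketch under stated assumptions: the class and method names (MapFileIndexSketch, indexEveryNth, indexFirstOfKeyOnly) are hypothetical, plain String keys stand in for WritableComparable keys, array positions stand in for byte offsets in the data file, and the "index every Nth entry" rule is assumed to be the current behaviour, as implied by the proposed patch. Each method returns the positions that would receive an index entry.

```java
import java.util.ArrayList;
import java.util.List;

public class MapFileIndexSketch {

    /** Assumed current rule: index every indexInterval-th entry, whatever its key. */
    static List<Integer> indexEveryNth(String[] sortedKeys, int indexInterval) {
        List<Integer> indexed = new ArrayList<>();
        long size = 0;
        for (int pos = 0; pos < sortedKeys.length; pos++) {
            if (size % indexInterval == 0) {
                indexed.add(pos);          // may point at a non-first duplicate
            }
            size++;
        }
        return indexed;
    }

    /** Proposed rule: only index an entry that is the first one for its key. */
    static List<Integer> indexFirstOfKeyOnly(String[] sortedKeys, int indexInterval) {
        List<Integer> indexed = new ArrayList<>();
        long size = 0;
        String lastKey = null;
        for (int pos = 0; pos < sortedKeys.length; pos++) {
            String key = sortedKeys[pos];
            boolean equalsLastKey = size != 0 && key.equals(lastKey);
            lastKey = key;                 // stands in for checkKey() updating lastKey
            boolean largeEnoughInterval = size % indexInterval == 0;
            if (largeEnoughInterval && !equalsLastKey) {
                indexed.add(pos);
            }
            // Hold the counter at an index boundary until a new key arrives.
            if (!largeEnoughInterval || !equalsLastKey) {
                size++;
            }
        }
        return indexed;
    }

    public static void main(String[] args) {
        // Key "b" has three entries, at positions 1, 2 and 3.
        String[] keys = {"a", "b", "b", "b", "c", "d"};
        System.out.println(indexEveryNth(keys, 2));        // prints [0, 2, 4]
        System.out.println(indexFirstOfKeyOnly(keys, 2));  // prints [0, 4]
    }
}
```

With an interval of 2, the first rule indexes position 2, which is the second "b" entry, so a lookup that starts from the index would skip the "b" entry at position 1. The second rule skips the duplicates at the boundary and indexes position 4, the first "c" entry, instead.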