Message-ID: <303865384.1236750170552.JavaMail.jira@brutus>
Date: Tue, 10 Mar 2009 22:42:50 -0700 (PDT)
From: "Chris Douglas (JIRA)"
To: core-dev@hadoop.apache.org
Subject: [jira] Created: (HADOOP-5459) CRC errors not detected reading intermediate output into memory with problematic length

CRC errors not detected reading intermediate output into memory with problematic length
---------------------------------------------------------------------------------------

Key: HADOOP-5459
URL:
https://issues.apache.org/jira/browse/HADOOP-5459
Project: Hadoop Core
Issue Type: Bug
Affects Versions: 0.20.0
Reporter: Chris Douglas
Priority: Blocker

It's possible that the expected, uncompressed length of the segment is less than the available/decompressed data. This can happen in some worst cases for compression, but it is exceedingly rare. It is also possible (though also fantastically unlikely) for the data to deflate to a size greater than that reported by the map. CRC errors will remain undetected because IFileInputStream does not validate the checksum until the end of the stream, and close() does not advance the stream to the end of the segment. The (abbreviated) read loop fetching data in shuffleInMemory:

{code}
int n = input.read(shuffleData, 0, shuffleData.length);
while (n > 0) {
  bytesRead += n;
  n = input.read(shuffleData, bytesRead, (shuffleData.length-bytesRead));
}
{code}

This loop reads only up to the expected length, so any data in the segment beyond that length is never consumed. Without reading the whole segment, the checksum is not validated. IFileInputStream instances should always validate checksums, even when they are closed before reaching the end of the segment.
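The failure mode can be reproduced outside Hadoop with a small, self-contained sketch. Everything named here is hypothetical: ChecksummedSegmentStream is a simplified stand-in, not the real IFileInputStream, the segment layout (payload followed by an 8-byte CRC32 trailer) is an assumption made for illustration, and there is no compression layer. The read loop is the one quoted above. The close() shown here drains the unread payload before closing, which is one possible way to get the behaviour the report asks for, not necessarily the fix that was committed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.zip.CRC32;

/** Hypothetical stand-in for a checksummed segment reader; not Hadoop's IFileInputStream. */
class ChecksummedSegmentStream extends InputStream {
    private final InputStream in;
    private final CRC32 crc = new CRC32();
    private long remaining;            // payload bytes not yet consumed
    private boolean verified = false;  // set once the trailer has been checked

    ChecksummedSegmentStream(InputStream in, long payloadLength) {
        this.in = in;
        this.remaining = payloadLength;
    }

    @Override
    public int read() throws IOException {
        byte[] one = new byte[1];
        return read(one, 0, 1) == -1 ? -1 : (one[0] & 0xff);
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) return 0;                    // full caller buffer: nothing consumed
        if (remaining == 0) { verify(); return -1; }
        int n = in.read(buf, off, (int) Math.min(len, remaining));
        if (n < 0) throw new IOException("segment truncated");
        crc.update(buf, off, n);
        remaining -= n;
        if (remaining == 0) verify();              // checksum is checked only at end of payload
        return n;
    }

    private void verify() throws IOException {
        if (verified) return;
        long expected = new DataInputStream(in).readLong();  // assumed 8-byte trailer
        if (expected != crc.getValue()) throw new IOException("checksum mismatch");
        verified = true;
    }

    /** Drains any unread payload so the checksum is checked before the stream is closed. */
    @Override
    public void close() throws IOException {
        try {
            byte[] skip = new byte[4096];
            while (!verified && read(skip, 0, skip.length) != -1) { /* discard */ }
        } finally {
            in.close();
        }
    }
}

class ShuffleCrcDemo {
    /** Builds payload + CRC32 trailer, then flips one bit in the last payload byte. */
    static byte[] corruptSegment(byte[] payload) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.write(payload);
        out.writeLong(crc.getValue());
        byte[] bytes = bos.toByteArray();
        bytes[payload.length - 1] ^= 0x01;
        return bytes;
    }

    /** Runs the read loop from the report; returns true if the corruption was detected. */
    static boolean detects(int actualLength, int expectedLength, boolean close) {
        try {
            byte[] payload = new byte[actualLength];
            Arrays.fill(payload, (byte) 7);
            ChecksummedSegmentStream input = new ChecksummedSegmentStream(
                new ByteArrayInputStream(corruptSegment(payload)), actualLength);
            byte[] shuffleData = new byte[expectedLength];
            int bytesRead = 0;
            int n = input.read(shuffleData, 0, shuffleData.length);
            while (n > 0) {
                bytesRead += n;
                n = input.read(shuffleData, bytesRead, (shuffleData.length - bytesRead));
            }
            if (close) input.close();
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("lengths match:           " + detects(64, 64, false));
        System.out.println("short read, no drain:    " + detects(64, 48, false));
        System.out.println("short read, drain close: " + detects(64, 48, true));
    }
}
```

In this sketch the corruption is caught when the expected and actual lengths agree, missed when the loop stops at a shorter expected length, and caught again once close() reads through to the trailer.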