Date: Tue, 26 Apr 2011 11:49:57 -0700
Subject: Reading from File
From: Mark question
To: common-user@hadoop.apache.org

Hi,

My mapper opens a file and reads records using next(). However, I want to stop reading if there is no memory available. What confuses me here is that even though I'm reading record by record with next(), Hadoop actually reads the underlying data in units of dfs.block.size. So I have two questions:

1. Is it true that even if I set dfs.block.size to 512 MB, at least one block is loaded into memory for the mapper to process (as part of the InputSplit)?

2. How can I read multiple records from a SequenceFile at once, and would it make a difference?

Thanks,
Mark
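For context, a minimal sketch of the record-by-record loop described above, using Hadoop's SequenceFile.Reader API (the key/value types and the input path are assumptions; substitute whatever types your file was actually written with):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SequenceFileReadSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path(args[0]); // hypothetical path to a SequenceFile

        SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
        try {
            // Assumed key/value types for illustration only.
            LongWritable key = new LongWritable();
            Text value = new Text();

            // next() deserializes exactly one record per call, but the
            // underlying DFS input stream still fetches data from the
            // datanodes in larger chunks regardless of how the records
            // are consumed here.
            while (reader.next(key, value)) {
                // process one record at a time
            }
        } finally {
            reader.close();
        }
    }
}
```

Note that next() only controls deserialization granularity, not I/O granularity, which is why record-by-record reading does not by itself bound the buffering done below it.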