From: Arbow
To: hadoop-user@lucene.apache.org
Date: Thu, 20 Apr 2006 14:56:16 +0800
Subject: Re: How is big file got divided

Hi Lei Chen,

Take a look at org.apache.hadoop.mapred.InputFormatBase; I think it will help you.

On 4/20/06, Lei Chen wrote:
> Hi,
> I am a new user of Hadoop. This project looks cool.
>
> I have one question about MapReduce. I want to process a big
> file. To my understanding, Hadoop will partition a big file into blocks, and
> each block is assigned to a worker. How does Hadoop decide where to
> cut those big files? Does it guarantee that each line in the input file will
> be assigned to one block, and that no line will be divided into two parts in
> different blocks?
>
> Lei
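
On the line-boundary question, here is a minimal sketch of one common way a line-oriented reader can avoid cutting lines when a file is divided at arbitrary byte offsets. It is not Hadoop's actual code and does not use InputFormatBase; the class SplitLineSketch and the methods readSplit and readAll are hypothetical names made up for illustration. The assumed rule is that a line belongs to the split that contains its first byte: a reader whose split starts mid-file skips forward to the first line start at or after its offset, and a reader may run past the end of its split to finish the last line it started.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Hadoop.
final class SplitLineSketch {

    // Returns the lines "owned" by the byte range [start, end) of data.
    // Assumed rule: a line belongs to the split containing its first byte.
    static List<String> readSplit(byte[] data, int start, int end) {
        int pos = start;
        if (start > 0) {
            // Back up one byte and skip through the next newline. If the byte
            // before start is '\n', a line begins exactly at start and is kept;
            // otherwise the partial line belongs to the previous split.
            pos = start - 1;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++;
        }
        List<String> lines = new ArrayList<>();
        // Only begin a line inside the split, but read it to its end
        // even when that end lies beyond the split boundary.
        while (pos < end && pos < data.length) {
            int eol = pos;
            while (eol < data.length && data[eol] != '\n') {
                eol++;
            }
            lines.add(new String(data, pos, eol - pos, StandardCharsets.ISO_8859_1));
            pos = eol + 1;
        }
        return lines;
    }

    // Cuts data into fixed-size byte ranges and concatenates what each range yields.
    static List<String> readAll(byte[] data, int splitSize) {
        List<String> all = new ArrayList<>();
        for (int start = 0; start < data.length; start += splitSize) {
            int end = Math.min(start + splitSize, data.length);
            all.addAll(readSplit(data, start, end));
        }
        return all;
    }

    public static void main(String[] args) {
        byte[] data = "aaa\nbbbb\ncc\n".getBytes(StandardCharsets.ISO_8859_1);
        // Cuts fall at byte offsets 5 and 10, both in the middle of a line.
        System.out.println(readAll(data, 5));
    }
}
```

With the sample data in main and a split size of 5 bytes, the three splits yield [aaa, bbbb], [cc] and [] respectively, so every line is read exactly once even though "bbbb" and "cc" each straddle a cut.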