From: Nathan Marz
To: core-user@hadoop.apache.org
Subject: Custom input format getSplits being called twice
Date: Thu, 25 Sep 2008 10:49:23 -0700

Hello all,

I am seeing some odd behavior from Hadoop that looks like a bug. I have created a custom input format, and I am observing that my "getSplits" method is being called twice. Each call is made on a different instance of the input format. The job, however, runs only once, using the result of the second call to getSplits. The first call receives the numSplits hint as expected, while in the second call that value is overridden to 1. I am running Hadoop in standalone mode.

Does anyone know anything about this issue?

Thanks,
Nathan Marz
Rapleaf
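
For reference, here is a minimal sketch of the kind of instrumentation that shows the pattern described above: it records which instance received each getSplits call, the numSplits hint, and the immediate caller. It is not the actual input format. SplitSource is a hypothetical stand-in for the getSplits(JobConf, int) method of org.apache.hadoop.mapred.InputFormat so that the example compiles without the Hadoop jars, and LoggingSplitSource and GetSplitsProbeDemo are illustrative names. The two calls in main only simulate what was observed; they say nothing about why Hadoop makes them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the old-API InputFormat.getSplits(JobConf, int);
// the JobConf argument is dropped so the sketch needs no Hadoop classes.
interface SplitSource {
    String[] getSplits(int numSplits);
}

// Records every getSplits call: receiving instance, hint, and caller.
class LoggingSplitSource implements SplitSource {
    static final List<String> CALLS =
            Collections.synchronizedList(new ArrayList<>());

    @Override
    public String[] getSplits(int numSplits) {
        // Frame 0 is this method; frame 1, when present, is whoever called it.
        StackTraceElement[] stack = new Throwable().getStackTrace();
        String caller = stack.length > 1 ? stack[1].toString() : "unknown";

        CALLS.add("instance=" + System.identityHashCode(this)
                + " numSplits=" + numSplits
                + " caller=" + caller);

        // Placeholder splits: one per requested split, at least one.
        String[] splits = new String[Math.max(1, numSplits)];
        for (int i = 0; i < splits.length; i++) {
            splits[i] = "split-" + i;
        }
        return splits;
    }
}

class GetSplitsProbeDemo {
    public static void main(String[] args) {
        // Simulates the reported pattern: two instances, second hint is 1.
        new LoggingSplitSource().getSplits(4);
        new LoggingSplitSource().getSplits(1);
        LoggingSplitSource.CALLS.forEach(System.out::println);
    }
}
```

In a real job the same three lines of logging would go at the top of the custom input format's getSplits; the caller frame is what tells the two invocations apart.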