Date: Mon, 17 Dec 2007 19:07:05 -0800
Subject: Re: question on file, inputformats and outputformats
From: Ted Dunning
Reply-To: hadoop-user@lucene.apache.org

I thought that is what your input file already was. The KeyValueTextInputFormat should read your input as-is.
When you write out your intermediate values, just make sure that you use TextOutputFormat and put "DIR" as the key and the directory name as the value (same with files).

On 12/17/07 6:46 PM, "Jim the Standing Bear" wrote:

> With KeyValueTextInputFormat, the problem is not reading it - I know
> how to set the separator byte and all that... my problem is with
> creating the very first file - I simply don't know how.
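
For the "very first file" in the quoted question, one possible approach is to write it by hand as plain text, one record per line, with the key and value separated by a tab. Tab is the default separator that KeyValueTextInputFormat splits on and that TextOutputFormat writes, so such a file should have the same shape as the later intermediate output. The sketch below is only an illustration in plain Java with no Hadoop dependency; the class name SeedFileWriter, the file name seed.txt, and the directory /user/jim/root are hypothetical and not taken from the thread.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class SeedFileWriter {
    // Tab is the default key/value separator; change this if the job
    // is configured with a different separator byte.
    public static final char SEPARATOR = '\t';

    // Builds one text record: key, separator, value (no trailing newline).
    public static String formatRecord(String key, String value) {
        return key + SEPARATOR + value;
    }

    // Writes each {key, value} pair as one line of the seed file.
    public static void writeSeedFile(Path target, List<String[]> records) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (String[] record : records) {
                out.write(formatRecord(record[0], record[1]));
                out.write('\n');
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical output file name and starting directory.
        Path target = Paths.get(args.length > 0 ? args[0] : "seed.txt");
        List<String[]> records = new ArrayList<>();
        records.add(new String[] {"DIR", "/user/jim/root"});
        writeSeedFile(target, records);
    }
}
```

The resulting local file would still need to be copied into the distributed file system (for example with `hadoop fs -put`) before a job can use it as input.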