From: Andy Sautins
To: "core-user@hadoop.apache.org"
Date: Thu, 1 Oct 2009 10:10:53 -0600
Subject: Map/Reduce and sequence file metadata...

Hi all. I'm struggling a bit to figure this out and wondering if anyone has any pointers.

I'm using SequenceFiles as output from a MapReduce job (using SequenceFileOutputFormat) and then in a follow-up MapReduce job reading in the results using SequenceFileInputFormat. All seems to work fine. What I haven't figured out is how to write the SequenceFile.Metadata in the SequenceFileOutputFormat and then read the metadata in SequenceFileInputFormat. Is that possible to do using the new mapreduce.* API?

I have two types of files I want to process in the Mapper. Currently I'm using context.getInputSplit() and parsing the resulting fileSplit.getPath() to determine what file I'm processing. It seems cleaner to use the SequenceFile.Metadata if I can. Does that make sense, or am I off in the weeds?

Thanks

Andy
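
For reference, the path-based approach described in the message could be sketched roughly as below. This is only an illustration in plain Java, with no Hadoop dependency: the class name InputKindSketch, the Kind enum, and the directory names "type-a" and "type-b" are all hypothetical, since the message does not say how the two file types are named. In a real Mapper, the string passed to classify() would presumably come from ((FileSplit) context.getInputSplit()).getPath().toString().

```java
import java.util.Locale;

// Hypothetical sketch: decide which kind of input a map task is reading
// from the path string of its split. The directory names "type-a" and
// "type-b" are assumptions for illustration, not taken from the message.
final class InputKindSketch {

    enum Kind { TYPE_A, TYPE_B, UNKNOWN }

    // In a real Mapper, the argument would be something like
    // ((FileSplit) context.getInputSplit()).getPath().toString().
    static Kind classify(String splitPath) {
        if (splitPath == null) {
            return Kind.UNKNOWN;
        }
        String normalized = splitPath.toLowerCase(Locale.ROOT);
        // Compare whole path components so "type-a" does not match "type-ab".
        for (String part : normalized.split("/")) {
            if (part.equals("type-a")) {
                return Kind.TYPE_A;
            }
            if (part.equals("type-b")) {
                return Kind.TYPE_B;
            }
        }
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        // Example paths are made up; they only show the calling convention.
        System.out.println(classify("hdfs://namenode/data/type-a/part-r-00000"));
        System.out.println(classify("hdfs://namenode/data/type-b/part-r-00000"));
    }
}
```

The weakness of this approach is that the mapper's behaviour depends on a directory naming convention, which is the motivation for wanting the type recorded in the file itself instead.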