From: Luca Pireddu
Organization: CRS4
To: common-user@hadoop.apache.org
Subject: Re: Sorting ...
Date: Wed, 25 May 2011 11:09:35 +0200

On May 25, 2011 01:43:22 Mark question wrote:
> Thanks Luca, but what other way to sort a directory of sequence files?
>
> I don't plan to write a sorting algorithm in mappers/reducers, but hoping
> to use the sequenceFile.sorter instead.
>
> Any ideas?
>
> Mark

Maybe this class can help?

  org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat

With it you should be able to read (key,value) records from your sequence
files and then do whatever you need with them.

--
Luca Pireddu
CRS4 - Distributed Computing Group
Loc. Pixina Manna Edificio 1
Pula 09010 (CA), Italy
Tel: +39 0709250452
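
To illustrate the suggestion above, here is a minimal sketch of a job that reads a directory of sequence files through SequenceFileInputFormat and relies on the framework's shuffle to order records by key, so no sorting code is written in the mapper or reducer. Several things here are assumptions and not taken from this thread: the class name SortSequenceFiles and the helper buildJob are made up for illustration, the keys and values are assumed to be Text (substitute whatever Writable types your files actually contain), and Job.getInstance assumes a newer Hadoop release (on 0.20.x you would likely use new Job(conf, name) instead). It needs the Hadoop client libraries on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

// Hypothetical example class; the name is not part of Hadoop.
final class SortSequenceFiles {

    // Configures (but does not submit) a job that copies sequence files
    // from `in` to `out`, ordered by key.
    static Job buildJob(Configuration conf, Path in, Path out) throws IOException {
        Job job = Job.getInstance(conf, "sort-sequence-files");
        job.setJarByClass(SortSequenceFiles.class);

        // Read (key, value) records from every sequence file under `in`.
        job.setInputFormatClass(SequenceFileInputFormat.class);
        FileInputFormat.addInputPath(job, in);

        // The base Mapper and Reducer classes pass records through unchanged;
        // the ordering comes from the sort done during the shuffle.
        job.setMapperClass(Mapper.class);
        job.setReducerClass(Reducer.class);

        // One reducer gives a single, totally ordered output file.
        job.setNumReduceTasks(1);

        // ASSUMPTION: keys and values are Text; change to match your data.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        FileOutputFormat.setOutputPath(job, out);
        return job;
    }

    public static void main(String[] args) throws Exception {
        Job job = buildJob(new Configuration(), new Path(args[0]), new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

With more than one reducer, each output part would be sorted on its own but the parts would not be globally ordered unless a total-order partitioner is also configured, which is why the sketch uses a single reducer.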