From: Francesco Silvestri
To: user@hadoop.apache.org
Date: Mon, 3 Sep 2012 17:08:06 +0200
Subject: Re: reading a binary file

Hi Mohammad,

SequenceFileInputFormat requires the file to be a sequence of key/value pairs stored in binary (i.e., the key is stored in the file). In my case, the key is implicitly given by the position of the value within the file.

Thank you,
Francesco

On Mon, Sep 3, 2012 at 5:01 PM, Mohammad Tariq wrote:
> Hello Francesco,
>
> Have a look at SequenceFileInputFormat:
> http://hadoop.apache.org/mapreduce/docs/r0.21.0/api/org/apache/hadoop/mapreduce/lib/input/SequenceFileInputFormat.html
>
> Regards,
> Mohammad Tariq
>
> On Mon, Sep 3, 2012 at 8:26 PM, Francesco Silvestri wrote:
>
>> Hello,
>>
>> I have a binary file of integers and I would like an input format that
>> generates pairs <key,value>, where value is an integer in the file and key
>> is the position of the integer in the file. Which class should I use? (I.e.,
>> I'm looking for a kind of TextInputFormat for binary files.)
>>
>> Thank you for your consideration,
>>
>> Francesco
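
The record layout described in this thread (fixed-width integers whose key is implied by their position) can be sketched in plain Java as below. This is only an illustration of the reading logic, not Hadoop code: the class and method names (PositionalIntReader, Record, readSplit) are hypothetical, and it assumes 4-byte big-endian integers, which the thread does not state. It also takes "position" to mean the index of the integer rather than its byte offset. In a Hadoop job, logic of this kind would normally sit in a custom RecordReader returned by a FileInputFormat subclass.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads fixed-width ints and pairs each with its index.
final class PositionalIntReader {
    // Assumption: each record is one 4-byte big-endian int.
    static final int RECORD_BYTES = Integer.BYTES;

    // One <key,value> pair: key = index of the int in the file, value = the int.
    static final class Record {
        final long position;
        final int value;

        Record(long position, int value) {
            this.position = position;
            this.value = value;
        }
    }

    // Returns the records that START inside [splitStart, splitStart + splitLength).
    // A record straddling the end of a split is read by that split, and the next
    // split skips forward to its first aligned record, so no record is read twice.
    static List<Record> readSplit(byte[] data, long splitStart, long splitLength) {
        ByteBuffer buf = ByteBuffer.wrap(data); // ByteBuffer is big-endian by default
        // Index of the first record starting at or after splitStart (round up).
        long first = (splitStart + RECORD_BYTES - 1) / RECORD_BYTES;
        long splitEnd = Math.min(splitStart + splitLength, data.length);
        List<Record> out = new ArrayList<>();
        for (long i = first;
             i * RECORD_BYTES < splitEnd                 // record starts in this split
                 && (i + 1) * RECORD_BYTES <= data.length; // ignore a truncated tail
             i++) {
            out.add(new Record(i, buf.getInt((int) (i * RECORD_BYTES))));
        }
        return out;
    }
}
```

The point of rounding the split start up to a record boundary is that splits need not be multiples of the record size: each integer is still emitted exactly once, with a key that depends only on where it sits in the file.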