Date: Thu, 24 Jun 2010 12:44:42 -0700
From: Steve Lewis
To: mapreduce-user
Subject: Custom File reader

I have a number of files that can be read and converted into a series of lines of text; however, the means of reading them is not known to the standard Hadoop splitters. I understand that I can override FileInputFormat so that isSplitable returns false. I am a little unclear on how to get the Job to use my version of that FileInputFormat, and nowhere do I see a place to override the code that reads the file and converts it to lines of text.

Does anyone know how to do this?

--
Steven M. Lewis PhD
Institute for Systems Biology
Seattle WA
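For context on the question above: in the org.apache.hadoop.mapreduce API, the reading code normally lives in a RecordReader. A FileInputFormat subclass overrides isSplitable to return false and createRecordReader to return that reader, and the job is pointed at the subclass with job.setInputFormatClass(...). The sketch below is only a plain-Java stand-in for the reader half, so that it runs without Hadoop on the classpath. The class name CustomLineReaderSketch, the helper readAll, and the 0x1E record delimiter are all hypothetical, since the message does not say what the file format is. The method names mirror RecordReader (nextKeyValue, getCurrentKey, getCurrentValue, close), but this is not the Hadoop class itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/**
 * Plain-Java stand-in for a Hadoop RecordReader<LongWritable, Text>.
 * It consumes one whole, unsplit input stream and yields one line per call.
 * ASSUMPTION: the "custom format" is records separated by the byte 0x1E.
 * The real format is not given in the message.
 */
class CustomLineReaderSketch implements AutoCloseable {
    private static final int RECORD_SEPARATOR = 0x1E; // hypothetical delimiter

    private final InputStream in;
    private long key = -1; // record index; plays the role of the LongWritable key
    private String value;  // decoded line; plays the role of the Text value

    CustomLineReaderSketch(InputStream in) {
        this.in = in;
    }

    /** Mirrors RecordReader.nextKeyValue(): advance, or return false at end of input. */
    boolean nextKeyValue() throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        boolean sawAnyByte = false;
        int b;
        while ((b = in.read()) != -1) {
            sawAnyByte = true;
            if (b == RECORD_SEPARATOR) {
                break; // end of this record
            }
            buf.write(b);
        }
        if (!sawAnyByte) {
            return false; // nothing left to read
        }
        key++;
        value = new String(buf.toByteArray(), StandardCharsets.UTF_8);
        return true;
    }

    long getCurrentKey() {
        return key;
    }

    String getCurrentValue() {
        return value;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    /** Hypothetical convenience helper: drain the reader into a list of lines. */
    static List<String> readAll(InputStream in) {
        List<String> lines = new ArrayList<>();
        try (CustomLineReaderSketch reader = new CustomLineReaderSketch(in)) {
            while (reader.nextKeyValue()) {
                lines.add(reader.getCurrentValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] data = "first\u001Esecond\u001Ethird".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(new ByteArrayInputStream(data)));
    }
}
```

In a real job, the body of nextKeyValue would presumably move into a RecordReader subclass whose initialize method opens the split's path, with the key and value held in LongWritable and Text objects. Using a record index as the key is a simplification here; TextInputFormat uses the byte offset instead.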