hadoop-common-user mailing list archives

From Raymond Jennings III <raymondj...@yahoo.com>
Subject Re: Passing whole text file to a single map
Date Sat, 23 Jan 2010 14:54:16 GMT
Not sure if this solves your problem, but I had a similar case: there was unique data at
the beginning of the file, and whenever the file was split between maps, the 2nd and
subsequent maps never saw it. I was able to pull the file name from the conf and read the
first two lines of the file in every map.
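
A minimal sketch of the "read the first two lines" part, in plain Java so it runs without
a cluster. The class and method names (HeaderLines, readFirstLines) are made up for
illustration and are not part of Hadoop. In a real job the Reader would wrap a stream
opened on the input path taken from the job conf. I believe the old 0.20 API exposes that
path as the "map.input.file" property, but treat that as an assumption and check it
against your version.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop.
class HeaderLines {

    // Reads at most maxLines lines from the start of the stream.
    // Returns fewer lines if the input is shorter than that.
    static List<String> readFirstLines(Reader in, int maxLines) throws IOException {
        List<String> header = new ArrayList<String>();
        BufferedReader reader = new BufferedReader(in);
        String line;
        while (header.size() < maxLines && (line = reader.readLine()) != null) {
            header.add(line);
        }
        return header;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the input file. In a job, the Reader would instead wrap
        // a stream opened on the path read from the job configuration.
        Reader in = new StringReader("header one\nheader two\nrecord 1\nrecord 2\n");
        System.out.println(readFirstLines(in, 2)); // prints [header one, header two]
    }
}
```

Doing this once per task (for example when the mapper is configured) rather than once per
record keeps the extra read cheap, since every map only re-reads the two header lines.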

--- On Sat, 1/23/10, stolikp <stolikp@o2.pl> wrote:

> From: stolikp <stolikp@o2.pl>
> Subject: Passing whole text file to a single map
> To: core-user@hadoop.apache.org
> Date: Saturday, January 23, 2010, 9:49 AM
> I've got some text files in my input directory and I want to pass each
> text file (the whole file, not just a line) to a map (one file per map).
> How can I do this? TextInputFormat splits text into lines and I do not
> want this to happen.
> I tried:
> http://hadoop.apache.org/common/docs/r0.20./streaming.html#How+do+I+process+files%2C+one+per+map%3F
> but it doesn't work for me: the compiler doesn't know what
> NonSplitableTextInputFormat.class is.
> I'm using Hadoop 0.20.1
> -- 
> View this message in context: http://old.nabble.com/Passing-whole-text-file-to-a-single-map-tp27286204p27286204.html
> Sent from the Hadoop core-user mailing list archive at
> Nabble.com.

