hadoop-common-user mailing list archives

From Ted Dunning <tdunn...@veoh.com>
Subject Re: [HADOOP-users] HowTo filter files for a Map/Reduce task over the same input folder
Date Fri, 11 Apr 2008 17:52:27 GMT

Just call addInputPath multiple times after filtering.  (It's
FileInputFormat.addInputPath; don't have documentation handy)
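The suggestion above can be sketched with the old org.apache.hadoop.mapred API: list the shared folder, keep only the files whose names start with the desired prefix, and add each surviving file as its own input path. The folder path, prefix, and class name below are made up for illustration; this is a sketch, not a tested job driver.

```java
import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobConf;

public class FilteredInputJob {
    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(FilteredInputJob.class);

        // Hypothetical shared input folder and file-name prefix.
        Path inputDir = new Path("/data/input");
        String prefix = "Elementary";

        FileSystem fs = inputDir.getFileSystem(conf);
        // Keep only files whose names start with the prefix, adding
        // each match as a separate input path for the job.
        for (FileStatus status : fs.listStatus(inputDir)) {
            if (!status.isDir()
                    && status.getPath().getName().startsWith(prefix)) {
                FileInputFormat.addInputPath(conf, status.getPath());
            }
        }
        // ... set mapper/reducer classes, then JobClient.runJob(conf)
    }
}
```

If the names share a simple pattern, it may also work to pass a glob directly, e.g. `FileInputFormat.addInputPath(conf, new Path("/data/input/Elementary*"))` -- input paths are glob-expanded when splits are computed, if I remember right.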


On 4/11/08 6:33 AM, "Alfonso Olias Sanz" <alfonso.olias.sanz@gmail.com>
wrote:

> Hi,
> I have a general-purpose input folder that is used as input to a
> Map/Reduce task. That folder contains files grouped by name.
> 
> I want to configure the JobConf so that I can filter which files
> are processed in that pass (i.e. files whose names start with
> Elementary, Source, etc.), so the task will only process those
> files. For example, if the folder contains 1000 files and only 50
> start with Elementary, only those 50 should be processed by my task.
> 
> I could set up separate input folders, each containing one group of
> files, but that is not an option in my case.
> 
> Any ideas?
> 
> Thanks

