hadoop-common-user mailing list archives

From "Mahajan, Neeraj" <nemaha...@ebay.com>
Subject RE: pattern for input files in MapReduce
Date Tue, 10 Jul 2007 17:06:09 GMT
You can extend FileInputFormat and override listPaths(). Depending on
your requirements, you may be able to reuse almost all of the code in
FileInputFormat#listPaths() and only substitute your own filter for
hiddenFileFilter.
You will then have to set the new input format when you create the job
conf:
		conf.setInputFormat(YOURInputFormat.class);
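
The filter itself is the only part that really changes, so here is a
rough standalone sketch of it in plain Java, with no Hadoop classes
involved. The class name DayLogFilter, the file name layout
<prefix>-<yyyyMMdd><HH>.log, and the rule of skipping names that start
with "_" or "." are all assumptions made for illustration; adapt them
to the real log names and wire accept() into whatever filter your
listPaths() override uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper: selects the log files that belong to one day. */
public class DayLogFilter {
    private final Pattern pattern;

    /** @param day the day to select, as yyyyMMdd (assumed name layout). */
    public DayLogFilter(String day) {
        if (!day.matches("\\d{8}")) {
            throw new IllegalArgumentException("expected yyyyMMdd, got: " + day);
        }
        // Assumed layout: <prefix>-<yyyyMMdd><HH>.log
        this.pattern = Pattern.compile(".*-" + Pattern.quote(day) + "\\d{2}\\.log");
    }

    /** Returns true if the file name belongs to the configured day. */
    public boolean accept(String fileName) {
        // Assumption: hidden files start with "_" or "." and are skipped.
        if (fileName.startsWith("_") || fileName.startsWith(".")) {
            return false;
        }
        return pattern.matcher(fileName).matches();
    }

    /** Filters a directory listing down to the accepted names. */
    public List<String> select(List<String> fileNames) {
        List<String> selected = new ArrayList<String>();
        for (String name : fileNames) {
            if (accept(name)) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        DayLogFilter filter = new DayLogFilter("20070710");
        List<String> listing = Arrays.asList(
                "access-2007070923.log",
                "access-2007071000.log",
                "access-2007071017.log",
                "_logs");
        System.out.println(filter.select(listing));
    }
}
```

Keeping the predicate separate from the input format makes it easy to
check against a sample directory listing before running a job.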

~ Neeraj

-----Original Message-----
From: Sandhya E [mailto:sandhyabhaskar@gmail.com] 
Sent: Tuesday, July 10, 2007 2:23 AM
To: hadoop-user@lucene.apache.org
Subject: pattern for input files in MapReduce

Hi

I'm using the latest version of Hadoop. Does it support specifying a
pattern for input file names, in addition to specifying an input path
through jobConf.setInputPath()? In my case, log files for over a month
are stored in a single folder with date+hour embedded in their names,
and I want MapReduce to run on one day's logs at a time.

TIA
Sandhya
