hadoop-mapreduce-user mailing list archives

From Kris Nuttycombe <kris.nuttyco...@gmail.com>
Subject Re: Configured & PathFilter
Date Mon, 12 Apr 2010 23:05:27 GMT
Whoops, so much for that idea. The Configuration instance being passed
to setConf is null.

I am utterly baffled. Is there seriously nobody out there using
PathFilter in this way? Everyone's just using dumb PathFilter
instances that don't have any configurable functionality?

/me boggles.

Kris
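
One possible explanation for the null, offered as an assumption rather than a confirmed diagnosis: if the filter extends Configured, the base-class no-arg constructor may itself call setConf(null), and the reflective factory calls setConf again afterwards with the job's configuration. A setConf override would then see null first and the real configuration second, so it has to tolerate the null call. The sketch below models this with stand-in types only. Configurable, Configured, PathFilter, and newInstance here are simplified local imitations, not the Hadoop classes, and the keys filter.date.min and filter.date.max are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in types only: nothing in this file is a Hadoop class.
class ConfigurablePathFilterSketch {

    /** Stand-in for a Configurable contract; a plain Map replaces Configuration. */
    public interface Configurable {
        void setConf(Map<String, String> conf);
        Map<String, String> getConf();
    }

    /** Stand-in for a path filter; the path is a plain String here. */
    public interface PathFilter {
        boolean accept(String path);
    }

    /** Assumed base-class behaviour: the no-arg constructor calls setConf(null). */
    public static class Configured implements Configurable {
        private Map<String, String> conf;
        public Configured() { setConf(null); }
        @Override public void setConf(Map<String, String> conf) { this.conf = conf; }
        @Override public Map<String, String> getConf() { return conf; }
    }

    /** Accepts paths whose last segment is an ISO date inside [minDate, maxDate]. */
    public static class DateRangePathFilter extends Configured implements PathFilter {
        public static int nullConfCalls = 0;      // counts setConf(null) invocations
        private String minDate = "0000-00-00";    // defaults apply until a conf arrives
        private String maxDate = "9999-12-31";

        @Override
        public void setConf(Map<String, String> conf) {
            super.setConf(conf);
            if (conf == null) {                   // constructor-time call: nothing to read yet
                nullConfCalls++;
                return;
            }
            // Hypothetical keys, chosen for this sketch only.
            minDate = conf.getOrDefault("filter.date.min", minDate);
            maxDate = conf.getOrDefault("filter.date.max", maxDate);
        }

        @Override
        public boolean accept(String path) {
            String name = path.substring(path.lastIndexOf('/') + 1);
            // ISO dates (yyyy-MM-dd) sort correctly as plain strings.
            return name.compareTo(minDate) >= 0 && name.compareTo(maxDate) <= 0;
        }
    }

    /** Imitates a reflective factory: instantiate, then configure if Configurable. */
    static <T> T newInstance(Class<T> cls, Map<String, String> conf) {
        try {
            T instance = cls.getDeclaredConstructor().newInstance();
            if (instance instanceof Configurable) {
                ((Configurable) instance).setConf(conf);
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("filter.date.min", "2010-04-01");
        conf.put("filter.date.max", "2010-04-30");
        DateRangePathFilter filter = newInstance(DateRangePathFilter.class, conf);
        System.out.println(filter.accept("/logs/2010-04-12"));
        System.out.println(filter.accept("/logs/2010-05-01"));
        System.out.println("setConf(null) calls: " + DateRangePathFilter.nullConfCalls);
    }
}
```

With this shape a single filter class serves every date range, because the range travels in the configuration rather than in the class. Whether the null seen above really comes from a constructor-time call would need to be checked against the Hadoop version in use.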

On Mon, Apr 12, 2010 at 2:03 PM, Kris Nuttycombe
<kris.nuttycombe@gmail.com> wrote:
> I just dove into the source, and it looks like the PathFilter instance
> is instantiated using ReflectionUtils, which then calls setConf. So if
> the resulting PathFilter instance implements Configurable, the
> configuration will be available.
>
> Kris
>
> On Mon, Apr 12, 2010 at 1:52 PM, Kris Nuttycombe
> <kris.nuttycombe@gmail.com> wrote:
>> static void setInputPathFilter(Job job, Class<? extends PathFilter> filter)
>>
>> This indicates that reflection will be used to instantiate the
>> required PathFilter object, and I need to be able to access the
>> minimum and maximum date for a given run. I don't want to have to
>> implement a separate PathFilter class for each set of dates,
>> obviously.
>>
>> Thanks,
>>
>> Kris
>>
>> On Mon, Apr 12, 2010 at 9:35 AM, Jeff Zhang <zjffdu@gmail.com> wrote:
>>> Hi Kris,
>>>
>>> Do you mean you want to use the PathFilter in a map or reduce task? Or do
>>> you mean using the PathFilter in the InputFormat?
>>> I guess you mean the second case. If so, you only need to call
>>> FileInputFormat.setInputPathFilter(,) to provide the filter information.
>>>
>>>
>>> On Mon, Apr 12, 2010 at 8:13 AM, Kris Nuttycombe <kris.nuttycombe@gmail.com>
>>> wrote:
>>>>
>>>> Hi, all, quick question about using PathFilter.
>>>>
>>>> Is there any way to provide information from the job configuration to
>>>> a PathFilter instance? In my case, I want to limit the date range of
>>>> the files being selected by the filter, and don't want to have to
>>>> hard-code a separate PathFilter instance for each date range I'm
>>>> interested in, obviously. If I make my PathFilter extend Configured,
>>>> will it do the right thing?
>>>>
>>>> Thanks!
>>>>
>>>> Kris
>>>
>>>
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang
>>>
>>
>
