hadoop-common-user mailing list archives

From Mark question <markq2...@gmail.com>
Subject Re: re-reading
Date Wed, 08 Jun 2011 16:38:10 GMT
I have a question about Harsh's suggestion, though. I wrote a custom InputFormat
that creates an array of RecordReaders and hands them to the MapRunner.

Will that mean multiple copies of the InputSplit are all held in memory? Or will
there be a single copy that all of the readers point to, as if they were pointers?

Thanks,
Mark
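
A minimal sketch of the reference semantics behind this question, in plain Java with no Hadoop dependency. `Split`, `Reader`, `DATA`, and `drain` are hypothetical stand-ins invented for illustration, not Hadoop classes. The sketch assumes a split is only a small descriptor (a start and an end) and that each reader keeps a reference to it plus its own cursor. Under that assumption, readers built from the same split object share one descriptor, while assigning one reader variable to another only creates an alias.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only; none of these are Hadoop classes.
final class RereadSketch {

    // Stand-in for the underlying file contents.
    static final List<String> DATA = Arrays.asList("r0", "r1", "r2", "r3");

    // Stand-in for an InputSplit: a descriptor of a range, not the data itself.
    static final class Split {
        final int start;
        final int end;
        Split(int start, int end) { this.start = start; this.end = end; }
    }

    // Stand-in for a RecordReader: a reference to the split plus a private cursor.
    static final class Reader {
        final Split split;   // a reference; the split is not copied
        private int pos;
        Reader(Split split) { this.split = split; this.pos = split.start; }
        String next() { return pos < split.end ? DATA.get(pos++) : null; }
    }

    // Reads a reader to the end and returns how many records it produced.
    static int drain(Reader r) {
        int n = 0;
        while (r.next() != null) n++;
        return n;
    }

    public static void main(String[] args) {
        Split split = new Split(0, 3);

        // Aliasing: both variables name the same reader object.
        Reader input = new Reader(split);
        Reader copyInput = input;
        System.out.println(drain(input));      // 3
        System.out.println(drain(copyInput));  // 0, the shared cursor is already at the end

        // Two readers over one split: independent cursors, one shared descriptor.
        Reader first = new Reader(split);
        Reader second = new Reader(split);
        System.out.println(drain(first));                 // 3
        System.out.println(drain(second));                // 3
        System.out.println(first.split == second.split);  // true
    }
}
```

If a real InputFormat wrapper passes the same InputSplit instance to each RecordReader it opens, the same reasoning would apply: one split object, referenced by every reader. Whatever streams or buffers each reader opens for itself are separate and are not covered by this sketch.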

On Wed, Jun 8, 2011 at 9:13 AM, Mark question <markq2011@gmail.com> wrote:

> Thanks for the replies, but input doesn't have a clone() method (I don't
> know why), so I'll have to write my own custom InputFormat. I was hoping
> for an easier way, though.
>
> Thank you,
> Mark
>
>
> On Wed, Jun 8, 2011 at 1:58 AM, Harsh J <harsh@cloudera.com> wrote:
>
>> Or, if that does not work for any reason (I haven't really tried it), try
>> writing your own InputFormat wrapper, in which you have direct
>> access to the InputSplit object and can do what you want with it (open
>> two record readers and manage them separately).
>>
>> On Wed, Jun 8, 2011 at 1:48 PM, Stefan Wienert <stefan@wienert.cc> wrote:
>> > Try input.clone()...
>> >
>> > 2011/6/8 Mark question <markq2011@gmail.com>:
>> >> Hi,
>> >>
>> >>   I'm trying to read the InputSplit over and over using the following
>> >> function in MapperRunner:
>> >>
>> >> @Override
>> >> public void run(RecordReader input, OutputCollector output,
>> >>                 Reporter reporter) throws IOException {
>> >>
>> >>     RecordReader copyInput = input;
>> >>
>> >>     // First read
>> >>     while (input.next(key, value));
>> >>
>> >>     // Second read
>> >>     while (copyInput.next(key, value));
>> >> }
>> >>
>> >> It can clearly be seen that this won't work, because both RecordReaders
>> >> are actually the same object. I'm trying to find a way for the second
>> >> reader to start reading the split again from the beginning. How can I
>> >> do that?
>> >>
>> >> Thanks,
>> >> Mark
>> >>
>> >
>> >
>> >
>> > --
>> > Stefan Wienert
>> >
>> > http://www.wienert.cc
>> > stefan@wienert.cc
>> >
>> > Telefon: +495251-2026838
>> > Mobil: +49176-40170270
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>
>
