hadoop-common-user mailing list archives

From Mark question <markq2...@gmail.com>
Subject Re: re-reading
Date Wed, 08 Jun 2011 16:13:35 GMT
Thanks for the replies, but input doesn't have a clone() method (I don't know
why), so I'll have to write my own custom InputFormat. I was hoping for an
easier way, though.

Thank you,
Mark

On Wed, Jun 8, 2011 at 1:58 AM, Harsh J <harsh@cloudera.com> wrote:

> Or if that does not work for any reason (I haven't really tried it), try
> writing your own InputFormat wrapper, in which you have direct
> access to the InputSplit object and can do what you want (open two
> record readers and manage them separately).
>
> On Wed, Jun 8, 2011 at 1:48 PM, Stefan Wienert <stefan@wienert.cc> wrote:
> > Try input.clone()...
> >
> > 2011/6/8 Mark question <markq2011@gmail.com>:
> >> Hi,
> >>
> >>   I'm trying to read the InputSplit over and over using the following
> >> function in MapperRunner:
> >>
> >> @Override
> >> public void run(RecordReader input, OutputCollector output,
> >>                 Reporter reporter) throws IOException {
> >>
> >>   RecordReader copyInput = input;
> >>
> >>   // First read
> >>   while (input.next(key, value));
> >>
> >>   // Second read
> >>   while (copyInput.next(key, value));
> >> }
> >>
> >> It can clearly be seen that this won't work because both RecordReaders are
> >> actually the same. I'm trying to find a way for the second reader to start
> >> reading the split again from the beginning ... How can I do that?
> >>
> >> Thanks,
> >> Mark
> >>
> >
> >
> >
> > --
> > Stefan Wienert
> >
> > http://www.wienert.cc
> > stefan@wienert.cc
> >
> > Telefon: +495251-2026838
> > Mobil: +49176-40170270
> >
>
>
>
> --
> Harsh J
>
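
The aliasing problem in the quoted run() method is not specific to Hadoop, so it can be shown with plain Java. The sketch below is only an illustration under stated assumptions: ReaderFactory, secondPassWithAlias and secondPassWithReopen are hypothetical names, and a BufferedReader over a string stands in for a RecordReader over a split. In the old org.apache.hadoop.mapred API, the "reopen" step would roughly correspond to calling InputFormat.getRecordReader(split, job, reporter) a second time from a wrapper that keeps the InputSplit, which is the approach suggested in the thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class RereadSketch {

    /** Hypothetical stand-in for "open a new reader over the same split". */
    interface ReaderFactory {
        BufferedReader open() throws IOException;
    }

    /** Reads the reader to the end and returns how many records (lines) it saw. */
    static int drain(BufferedReader reader) throws IOException {
        int count = 0;
        while (reader.readLine() != null) {
            count++;
        }
        return count;
    }

    /** Mirrors the quoted code: copyInput is the same object, already exhausted. */
    static int secondPassWithAlias(ReaderFactory factory) throws IOException {
        try (BufferedReader input = factory.open()) {
            BufferedReader copyInput = input; // copies the reference, not the reader
            drain(input);                     // first read consumes everything
            return drain(copyInput);          // second read finds nothing left
        }
    }

    /** Opens a second, independent reader so the second pass starts at the beginning. */
    static int secondPassWithReopen(ReaderFactory factory) throws IOException {
        try (BufferedReader first = factory.open()) {
            drain(first);                     // first read
        }
        try (BufferedReader second = factory.open()) {
            return drain(second);             // second read over a fresh reader
        }
    }

    public static void main(String[] args) throws IOException {
        // Two tab-separated records stand in for the contents of one split.
        ReaderFactory split = () -> new BufferedReader(new StringReader("k1\tv1\nk2\tv2\n"));
        System.out.println("alias second pass:  " + secondPassWithAlias(split));  // 0
        System.out.println("reopen second pass: " + secondPassWithReopen(split)); // 2
    }
}
```

The point of the factory is that the code doing the second pass needs whatever was used to create the reader in the first place (in Hadoop terms, the split and the job configuration), not just the reader it was handed.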
