hadoop-common-user mailing list archives

From Russell Jurney <russell.jur...@gmail.com>
Subject Re: Bad records
Date Sat, 07 Jul 2012 22:55:25 GMT
Presumably the job is failing because of exceptions thrown while parsing records. Trace
the exception in your logs, wrap the failing parsing code in a
try/catch, and in your catch block increment a counter and continue. Also consider
validating each record as the first thing your mapper does.
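
For concreteness, here is a minimal sketch of that pattern using the new
mapreduce API. This code is not from the thread itself; the SkippingMapper
class name, the RecordQuality counter enum, and the tab-delimited
word/count record format are all illustrative assumptions.

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.log4j.Logger;

public class SkippingMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

  // Hypothetical counter group for tracking good vs. bad records
  enum RecordQuality { GOOD, BAD }

  private static final Logger LOG = Logger.getLogger(SkippingMapper.class);

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    String line = value.toString();

    // Cheap validity check before parsing, as suggested above
    if (line.isEmpty()) {
      context.getCounter(RecordQuality.BAD).increment(1);
      return;
    }

    try {
      // Hypothetical record format: word \t count
      String[] fields = line.split("\t");
      long count = Long.parseLong(fields[1]);  // may throw on malformed input
      context.write(new Text(fields[0]), new LongWritable(count));
      context.getCounter(RecordQuality.GOOD).increment(1);
    } catch (Exception e) {
      // Log the bad record and move on instead of killing the job
      LOG.warn("Skipping bad record at offset " + key + ": " + line, e);
      context.getCounter(RecordQuality.BAD).increment(1);
    }
  }
}

The counter totals show up in the job's counter output and the JobTracker
web UI, so after the run you can judge whether the number of skipped
records is an acceptable loss (see the Quora link quoted below).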

On Sat, Jul 7, 2012 at 3:21 PM, Abhishek <abhishek.dodda1@gmail.com> wrote:

> Hi Russell,
>
> Thanks for the answer. Can you tell me how I would skip bad records in
> MapReduce code?
>
> Regards
> Abhi
>
> Sent from my iPhone
>
> On Jul 7, 2012, at 5:22 PM, Russell Jurney <russell.jurney@gmail.com>
> wrote:
>
> > Throw, catch, and handle an exception on bad records. Don't error out.
> > Log the error in your exception handler and increment a counter.
> >
> > For general discussion, see:
> >
> http://www.quora.com/Big-Data/In-Big-Data-ETL-how-many-records-are-an-acceptable-loss
> >
> > On Sat, Jul 7, 2012 at 1:41 PM, Abhishek <abhishek.dodda1@gmail.com>
> wrote:
> >
> >> Hi all,
> >>
> >> If the job is failing because of some bad records, how would I know
> >> which records are bad? Can I put them in a log file and skip those
> >> records?
> >>
> >> Regards
> >> Abhi
> >>
> >>
> >> Sent from my iPhone
> >>
> >
> >
> >
> > --
> > Russell Jurney twitter.com/rjurney russell.jurney@gmail.com
> > datasyndrome.com
>



-- 
Russell Jurney twitter.com/rjurney russell.jurney@gmail.com datasyndrome.com
