hadoop-common-user mailing list archives

From Maheshwaran Janarthanan <ashwinwa...@hotmail.com>
Subject RE: Skipping Bad Records in M/R Job
Date Tue, 09 Aug 2011 17:41:19 GMT

Aaron,

I am doing some HTML parsing and special content extraction, which throws an error that
can't be handled by the exception handling mechanism!

Thanks,
Mahesh

> From: Aaron.Baff@telescope.tv
> To: common-user@hadoop.apache.org
> Date: Tue, 9 Aug 2011 10:38:37 -0700
> Subject: RE: Skipping Bad Records in M/R Job
> 
> If the 3rd party library is used as part of your Map() function, you could just catch
> the appropriate Exceptions, simply not emit that record, and return from Map() normally.
> 
> --Aaron
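
The suggestion above can be sketched in plain Java without any Hadoop dependency. This is only an illustration under stated assumptions: `extract` is a hypothetical stand-in for the third-party parsing call, and `map` plays the role of the body of a real Map() function, with a list standing in for the output collector and a one-element array standing in for a counter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SkipOnFailure {

    // Hypothetical stand-in for the third-party extraction call.
    static String extract(String record) {
        if (record.contains("<bad>")) {
            throw new IllegalStateException("cannot parse: " + record);
        }
        return record.trim().toUpperCase();
    }

    // Mirrors the body of a map() call: emit on success, emit nothing on failure.
    static void map(String record, List<String> output, int[] skipped) {
        try {
            output.add(extract(record));
        } catch (Exception e) {
            // Count the bad record and return normally so the task keeps running.
            skipped[0]++;
        }
    }

    public static void main(String[] args) {
        List<String> output = new ArrayList<>();
        int[] skipped = {0};
        for (String record : Arrays.asList(" first ", "<bad>", "second")) {
            map(record, output, skipped);
        }
        System.out.println("emitted=" + output + " skipped=" + skipped[0]);
    }
}
```

This only helps when the library fails by throwing an Exception. A failure that is raised as a java.lang.Error, or that brings down the task JVM, never reaches this catch block.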
> -----Original Message-----
> From: Maheshwaran Janarthanan [mailto:ashwinwaran@hotmail.com]
> Sent: Tuesday, August 09, 2011 10:28 AM
> To: HADOOP USERGROUP
> Subject: Skipping Bad Records in M/R Job
> 
> 
> Hi,
> 
> I have written a MapReduce job that uses third-party libraries to process unseen data,
> and the job fails because of errors in some records.
> 
> I came across the 'Skipping Bad Records' feature in Hadoop Map/Reduce. Can anyone send me a
> code snippet that enables this feature by setting properties on JobConf?
> 
> Thanks,
> Ashwin!
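
As a rough sketch of what is being asked for here: the snippet below collects the skipping-mode property names as they appear in Hadoop 0.20-era releases into a plain map, so it compiles without Hadoop on the classpath. The values are illustrative assumptions, not recommendations, and the `SkipModeSettings` class and `skipModeProperties` method are hypothetical names. On a real job, each entry would be applied with `conf.set(name, value)` on the JobConf.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SkipModeSettings {

    // Builds the property name/value pairs; all values are illustrative assumptions.
    static Map<String, String> skipModeProperties(String skipOutputDir) {
        Map<String, String> props = new LinkedHashMap<>();
        // Enter skipping mode after this many failed attempts of the same task.
        props.put("mapred.skip.attempts.to.start.skipping", "2");
        // Acceptable number of records skipped around a bad record in the map phase.
        // The default of 0 leaves skipping disabled.
        props.put("mapred.skip.map.max.skip.records", "1");
        // Acceptable number of key groups skipped around a bad group in the reduce phase.
        props.put("mapred.skip.reduce.max.skip.groups", "1");
        // Directory where the skipped records are written for later inspection.
        props.put("mapred.skip.out.dir", skipOutputDir);
        return props;
    }

    public static void main(String[] args) {
        // With Hadoop available, the loop body would be conf.set(key, value) on a JobConf.
        for (Map.Entry<String, String> e : skipModeProperties("/tmp/skipped").entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

In the old `org.apache.hadoop.mapred` API the same settings are also exposed through static setters on the `SkipBadRecords` class, for example `SkipBadRecords.setMapperMaxSkipRecords(conf, 1)`. Property names changed in later releases, so the Javadoc of the release in use is the place to confirm them.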
> 
> 
> 
> > Date: Sun, 7 Aug 2011 01:11:29 +0530
> > From: jagaran_das@yahoo.co.in
> > Subject: Help on DFSClient
> > To: common-user@hadoop.apache.org; user@pig.apache.org
> >
> > I am keeping a Stream Open and writing through it using a multithreaded application.
> > The application is in a different box and I am connecting to NN remotely.
> >
> > I was using FileSystem and getting the same error, and now I am trying DFSClient and
> > getting the same error.
> >
> > When I run it via a simple standalone class, it does not throw any error, but
> > when I put it in my application, it throws this error.
> >
> > Please help me with this.
> >
> > Regards,
> > JD
> >
> >
> >  public String toString() {
> >       String s = getClass().getSimpleName();
> >       if (LOG.isTraceEnabled()) {
> >         return s + "@" + DFSClient.this + ": "
> >                + StringUtils.stringifyException(new Throwable("for testing"));
> >       }
> >       return s;
> >     }
> >
> > My Stack Trace :::
> >
> >
> > 06Aug2011 12:29:24,345 DEBUG [listenerContainer-1] (DFSClient.java:1115) - Wait for lease checker to terminate
> > 06Aug2011 12:29:24,346 DEBUG [LeaseChecker@DFSClient[clientName=DFSClient_280246853, ugi=jagarandas]: java.lang.Throwable: for testing
> > at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.toString(DFSClient.java:1181)
> > at org.apache.hadoop.util.Daemon.<init>(Daemon.java:38)
> > at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.put(DFSClient.java:1094)
> > at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:547)
> > at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:513)
> > at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:497)
> > at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:442)
> > at com.apple.ireporter.common.persistence.ConnectionManager.createConnection(ConnectionManager.java:74)
> > at com.apple.ireporter.common.persistence.HDPPersistor.writeToHDP(HDPPersistor.java:95)
> > at com.apple.ireporter.datatransformer.translator.HDFSTranslator.persistData(HDFSTranslator.java:41)
> > at com.apple.ireporter.datatransformer.adapter.TranslatorAdapter.processData(TranslatorAdapter.java:61)
> > at com.apple.ireporter.datatransformer.DefaultMessageListener.persistValidatedData(DefaultMessageListener.java:276)
> > at com.apple.ireporter.datatransformer.DefaultMessageListener.onMessage(DefaultMessageListener.java:93)
> > at org.springframework.jms.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:506)
> > at org.springframework.jms.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:463)
> > at org.springframework.jms.listener.AbstractMessageListenerContainer.doExecuteListener(AbstractMessageListenerContainer.java:435)
> > at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.doReceiveAndExecute(AbstractPollingMessageListenerContainer.java:322)
> > at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveAndExecute(AbstractPollingMessageListenerContainer.java:260)
> > at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.invokeListener(DefaultMessageListenerContainer.java:944)
> > at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:868)
> > at java.lang.Thread.run(Thread.java:680)
 		 	   		  