hadoop-pig-dev mailing list archives

From Chris Olston <ols...@yahoo-inc.com>
Subject Re: Proposal for error handling in Pig
Date Sat, 01 Mar 2008 00:21:24 GMT
The trickiest case is where a user-defined function fails on some,
but not all, inputs. Do you:

a) fail the entire job
b) log the errors, omit those inputs, and keep on processing
(a questionable course of action for non-monotonic queries/programs,
e.g. set difference, as well as aggregations like COUNT)
c) log the errors and insert NULLs into the stream (perhaps the best
option, especially given that we're going to support nulls anyway)
d) support some subset of (a, b, c) and permit the user to choose on a
per-job basis

?
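
To make options (a) through (d) concrete, here is a minimal sketch of a per-job error policy wrapped around a UDF call. It is not Pig code: `ErrorPolicy`, `applyUdf`, and the in-memory `errorLog` list are hypothetical names invented for illustration, and a plain `java.util.function.Function` stands in for a user-defined function.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only; none of these names come from Pig.
class UdfErrorPolicyDemo {

    // Option (d): the user picks one of (a), (b), (c) per job.
    enum ErrorPolicy { FAIL_JOB, SKIP_INPUT, INSERT_NULL }

    static <I, O> List<O> applyUdf(List<I> inputs, Function<I, O> udf,
                                   ErrorPolicy policy, List<String> errorLog) {
        List<O> output = new ArrayList<>();
        for (I input : inputs) {
            try {
                output.add(udf.apply(input));
            } catch (RuntimeException e) {
                if (policy == ErrorPolicy.FAIL_JOB) {
                    // Option (a): abort the whole job on the first failure.
                    throw new IllegalStateException("UDF failed on input: " + input, e);
                }
                // Options (b) and (c) both log the error and keep going.
                errorLog.add("UDF failed on input " + input + ": " + e.getMessage());
                if (policy == ErrorPolicy.INSERT_NULL) {
                    // Option (c): keep the row count intact by emitting a null.
                    output.add(null);
                }
                // Option (b): emit nothing, so the failing input is dropped.
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("1", "x", "3");
        List<String> errorLog = new ArrayList<>();

        List<Integer> skipped =
            applyUdf(inputs, Integer::parseInt, ErrorPolicy.SKIP_INPUT, errorLog);
        List<Integer> withNulls =
            applyUdf(inputs, Integer::parseInt, ErrorPolicy.INSERT_NULL, errorLog);

        System.out.println(skipped);    // prints [1, 3]
        System.out.println(withNulls);  // prints [1, null, 3]
    }
}
```

Note how a downstream COUNT would see two rows under `SKIP_INPUT` but three under `INSERT_NULL`, which is the concern raised about option (b).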

-Chris

On Feb 29, 2008, at 11:10 AM, Olga Natkovich wrote:

>
>
>> -----Original Message-----
>> From: Benjamin Francisoud [mailto:benjamin.francisoud@joost.com]
>> Sent: Friday, February 29, 2008 7:05 AM
>> To: pig-dev@incubator.apache.org
>> Subject: Re: Proposal for error handling in Pig
>>
>> About internal errors, do you consider code like this to be among them?
>>
>> public void something(Object object) {
>>     if (object == null) {
>>         throw new IllegalArgumentException("Object can't be null");
>>     }
>>     ...
>> }
>>
>> class StateMachine {
>>     public void start() {...}
>>     public void end() {
>>         if (!startCalled) {
>>             throw new IllegalStateException("You didn't call start()");
>>         }
>>     }
>> }
>>
>> About user errors, how should we handle them?
>> The way I proposed in PIG-100 [1]?
>
> Yes, that's fine. I personally don't see a strong reason to log the  
> exception stack in this case but I am fine with doing it if others  
> find it helpful. I will update the doc to include this information.
>
>>
>> try {
>>     plan = parser.Parse();
>> } catch (ParseException e) {
>>     log.error(e.getMessage());
>>     log.debug("Parse failed", e);  // Throwable argument carries the stack trace
>> }
>>
>>
>>
>> [1]
>> https://issues.apache.org/jira/browse/PIG-100?focusedCommentId=12573218#action_12573218
>>
>> Olga Natkovich wrote:
>>> Pig developers,
>>>
>>> We had many patches submitted that are trying to improve error handling.
>>> This is really great as many users ask exactly for that. So it seems
>>> timely to establish some guidelines on how errors should be handled,
>>> propagated, delivered, etc.
>>>
>>> I put together a proposal to start the discussion. Please review and
>>> comment. Once we have an agreement we would need to add the missing
>>> pieces to deploy it into Pig and then review the existing patches to
>>> make sure they follow the proposed practice.
>>>
>>> http://wiki.apache.org/pig/PigDeveloperCookbook
>>>
>>> I have also started a general document called Pig Developer Cookbook
>>> where we can keep track of development patterns we as a community want
>>> to follow.
>>>
>>> Thanks again for everybody's contributions!
>>>
>>> Olga
>>>
>>>
>>
>>

--
Christopher Olston, Ph.D.
Sr. Research Scientist
Yahoo! Research


