hadoop-general mailing list archives

From Owen O'Malley <omal...@apache.org>
Subject Re: avro in mapreduce
Date Wed, 27 Jan 2010 22:12:17 GMT
On Jan 26, 2010, at 10:15 AM, Doug Cutting wrote:

> This is a key link in a series of issues involved in integrating
> Avro in Mapreduce.

Getting Avro types passing through MapReduce is a good goal.

I apologize for not seeing the issue before it was committed. I accept
some of the blame for that because I've been buried in Hadoop emails.
That said, it is important to realize that changes that radically
alter the user's interaction with the framework require a lot of
discussion. This jira, as you've admitted, had a very unrepresentative  
subject and description, which means that very few people were  
following it. Additionally, there has been no design document on the  
change to the MapReduce framework's paradigm, so it wasn't clear what  
you were doing until this patch was committed.  Such a large change  
should have been highlighted on the public dev lists. In the future, I
would strongly suggest that all developers planning to make massive
incompatible changes post a design document on the public dev lists,
outside of Jira, to ensure the discussion happens before rather than
after the patch has been committed.

In terms of reverting the patch, I had fundamental issues with the  
changes and felt that we needed more time to discuss them. Allowing  
the patch to stay in trunk would bake it further and further in and  
make reverting it much harder.

I've listed my issues on the jira, but at a high level my concerns are:

1. Changing API compatibility is very expensive.
2. Changing the semantics is even more expensive.
3. We are discussing several alternatives on the jira.

Unlike Python and Linux, Apache has a democratic process and we have  
to work together to build consensus. The Apache rules are that a  
single -1 from a committer blocks the change from being made.  
Occasionally that has cost us a lot of time. For example, a single -1  
from a committer on an implementation detail of the symlink patch  
blocked it for more than a year. We need to work together to find a  
solution that everyone can live with.

-- Owen
