hadoop-general mailing list archives

From <Milind.Bhandar...@emc.com>
Subject Re: getting started building Mavenized hadoop common
Date Wed, 03 Aug 2011 22:33:57 GMT
I had this exact same argument with Arun Murthy at the hadoop dev meeting
about checking in protobuf generated code in the MR-279 branch. That same
evening, he had to download the entire Xcode and install it on his new Mac
to build the MR-279 branch :-)

Hadoop recordio also follows the same approach. It checks in the generated
code, and if javacc is found (i.e., javacc.home != ""), it generates the
parser.

Arun's concern was that someone might accidentally check in changes to
generated code (e.g., because they used a different version of protobuf).
But isn't there some way to flag these changes?
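
One possible way to flag them, offered only as a rough sketch and not as
anything the Hadoop build does today: regenerate the code into a scratch
directory and compare it, by content hash, with the checked-in tree. The
function names and the two directory arguments below are hypothetical; how
the scratch directory gets populated (protoc, javacc, thrift) is left out.

```python
import hashlib
from pathlib import Path


def tree_digests(root: Path) -> dict:
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def generated_code_drift(checked_in: Path, regenerated: Path) -> list:
    """List relative paths that are missing, extra, or different between trees.

    checked_in  -- hypothetical directory holding the committed generated code
    regenerated -- hypothetical scratch directory with freshly generated code
    """
    old = tree_digests(checked_in)
    new = tree_digests(regenerated)
    # A path drifts if it exists on only one side or its digest differs.
    return sorted(path for path in old.keys() | new.keys()
                  if old.get(path) != new.get(path))
```

A pre-commit hook or CI step could fail whenever the returned list is
non-empty, which would catch output produced by a different protobuf
version, provided the check itself runs with the agreed-upon version.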

- Milind

Milind Bhandarkar
Greenplum Labs, EMC
(Disclaimer: Opinions expressed in this email are those of the author, and
do not necessarily represent the views of any organization, past or
present, the author might be affiliated with.)

On 8/2/11 6:41 PM, "Ted Dunning" <tdunning@maprtech.com> wrote:

>(the following discusses religious practices ... please don't break into
>In the past, the simplest approach I have seen for dealing with this is to
>simply put the generated code under the normal source dir and check it in.
> This is particularly handy with Thrift since it is common for users of
>code to not have a working version of the Thrift compiler.  I then have an
>optional profile that does the code generation.  In my cases, I made that
>profile conditional on a thrift compiler being found, but there are other
>reasonable strategies.  I did the code generation by generating into a
>dir and then copying the code into the source tree so that if the
>generation failed, no code was changed.
>The nice side effect is that IDEs see the generated code as first class
>Many consider various aspects of this style to be bad practice.  Some
>condemn checking in generated code as akin to checking in jars.  I kind of
>agree, but lack of thrift or javacc is common enough that it really has to
>be dealt with by checking these in somewhere.  Only if your code generator
>really is ubiquitous is it feasible not to check in generated code.
>Others consider the commingling of generated and "real" code in the same
>directory tree to be a mortal sin.  I agree, but in a lesser form.  I
>strongly condemn the use of a single directory for generated and
>non-generated code, but if all directories avoid such miscegenation, then I
>don't see this as much of a problem.  Most people recognize that a package
>with a name "generated" will contain generated code.
>On Tue, Aug 2, 2011 at 5:44 PM, Tom White <tom@cloudera.com> wrote:
>> > I like to debug through the code :)  It would be nice if there were an
>> > automated way to handle that folder, but in the meantime, it would
>> probably
>> > be useful to document that along with the eclipse instructions.
>> I had to do this step too. I've added it to the instructions on
>> http://wiki.apache.org/hadoop/EclipseEnvironment, but I agree it would
>> be nice to automate this if anyone knows the relevant setting.
