hadoop-general mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: getting started building Mavenized hadoop common
Date Thu, 04 Aug 2011 11:38:34 GMT
On 03/08/11 02:41, Ted Dunning wrote:
> (the following discusses religious practices ... please don't break into
> flames)
> In the past, the simplest approach I have seen for dealing with this is to
> simply put the generated code under the normal source dir and check it in.
> This is particularly handy with Thrift since it is common for users of the
> code to not have a working version of the Thrift compiler.  I then have an
> optional profile that does the code generation.  In my cases, I made that
> profile conditional on a thrift compiler being found, but there are other
> reasonable strategies.  I did the code generation by generating into a temp
> dir and then copying the code into the source tree so that if the generation
> failed, no code was changed.
> The nice side effect is that IDEs see the generated code as first class
> code.
> Many consider various aspects of this style to be bad practice.  Some
> condemn checking in generated code as akin to checking in jars. I kind of
> agree, but lack of thrift or javacc is common enough that it really has to
> be dealt with by checking these in somewhere.  Only if your code generator
> really is ubiquitous is it feasible not to check in generated code.
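
The "generate into a temp dir, then copy" step described above can be
sketched roughly as follows. This is only an illustration, not the actual
profile: generate_then_copy and the generator_cmd callable are hypothetical
names, and the generator is assumed to be an external command that writes
into a directory it is given and exits non-zero on failure.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def generate_then_copy(generator_cmd, dest_dir):
    """Run a code generator into a temp dir; copy to dest_dir only on success.

    generator_cmd is a hypothetical callable: it receives the temporary
    output directory and returns the argv list of the generator to run.
    """
    dest_dir = Path(dest_dir)
    with tempfile.TemporaryDirectory() as tmp:
        # check=True raises CalledProcessError on a non-zero exit status,
        # so the copy below is never reached after a failed generation.
        subprocess.run(generator_cmd(tmp), check=True)
        # Merge the freshly generated files into the source tree.
        shutil.copytree(tmp, dest_dir, dirs_exist_ok=True)
```

If the generator fails, the exception propagates and the source tree is
left exactly as it was.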

The problem with this approach is that SVN will often say "it's changed" 
when it hasn't. You can do some tricks with Ant, using the <copy> task
to copy files only if they really are different, though once the
generator adds a timestamp to the header you are in trouble, and you 
have to look at the diffs to see if anything really has changed. I've 
had this problem in the past with Hibernate generated stuff.
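
One way to get past the timestamp problem is to compare the files with
the timestamp line filtered out, and copy only when something else has
changed. A minimal sketch, assuming the generator writes a comment line
such as "// Generated on 2011-08-04 11:38:34" (the pattern and the
function names below are made up for illustration; adjust them to
whatever header your generator really emits):

```python
import re
import shutil
from pathlib import Path

# Assumed header shape, e.g. "// Generated on 2011-08-04 11:38:34".
# This is a guess at a typical header, not any specific tool's format.
TIMESTAMP_LINE = re.compile(r"^\s*(//|\*|#).*\bGenerated (on|at)\b.*$")


def significant_lines(path):
    """Return the file's lines with timestamp-style header lines dropped."""
    text = Path(path).read_text(encoding="utf-8")
    return [line for line in text.splitlines() if not TIMESTAMP_LINE.match(line)]


def copy_if_changed(src, dst):
    """Copy src over dst only if they differ beyond the timestamp line.

    Returns True if a copy happened, False if dst was left untouched.
    """
    dst = Path(dst)
    if dst.exists() and significant_lines(src) == significant_lines(dst):
        return False
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dst)
    return True
```

A file that differs only in its timestamp is then never rewritten, so
SVN has nothing to report for it.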

> Others consider the commingling of generated and "real" code in the same
> directory tree to be a mortal sin.  I agree, but in a lesser form.  I
> strongly condemn the use of a single directory for generated and
> non-generated code, but if all directories avoid such miscegenation, then I
> don't see this as much of a problem.  Most people recognize that a package
> with a name "generated" will contain generated code.

I'd prefer to generate the stuff in the same tree, in a subdir, with
svn:ignore set up so the generated source is never committed. That way
it's all in the same tree, but you can't check it in. This keeps the
source there even when you rm -rf build, but keeps it out of SCM.
