Mailing-List: contact dev-help@directory.apache.org; run by ezmlm
Precedence: bulk
Reply-To: "Apache Directory Developers List" <dev@directory.apache.org>
Received-SPF: pass (athena.apache.org: domain of elecharny@gmail.com
 designates 64.233.182.190 as permitted sender)
DomainKey-Signature: a=rsa-sha1; c=nofws;
        d=gmail.com; s=beta;
        h=message-id:date:from:user-agent:mime-version:to:subject:references:in-reply-to:content-type:content-transfer-encoding;
        b=lr8kFCgSGkRlgS+Zvr8J51oooDgyfyggPnjzaTNRgePgCKWxmoFAljFjCHIpRiHM464z6iej44ekois1ryWxOBguu554EcKuZJ3kTeQ36JhqBBesK+9gc7O4PfFrxdQ4aQVmt5uwvN2qacvJMhc2TnT8uzkC8wEFT911tdqesDM=
Message-ID: <47E02B9E.4040701@gmail.com>
Date: Tue, 18 Mar 2008 21:52:46 +0100
From: Emmanuel Lecharny <elecharny@gmail.com>
User-Agent: Thunderbird 1.5.0.14ubu (X11/20080306)
MIME-Version: 1.0
To: Apache Directory Developers List <dev@directory.apache.org>
Subject: Re: Committing to Subversion (Re: Some questions about CI)
References: <a32f6b020803181115p2a1b1c80t4d753d6d3c692f4@mail.gmail.com>
 <a32f6b020803181245v67d120b3yb6bfe34385a89702@mail.gmail.com>
In-Reply-To: <a32f6b020803181245v67d120b3yb6bfe34385a89702@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit

<snip/>
> <seriously>
> BTW this was just for laughs on an otherwise boring topic.
> </seriously>
Sure ! This is the way I understood it :)
> <snip/>
> Before going into your points below I want to state the personal rules 
> I try to operate by on commits. 
>
> (1) Always try not to break the build and tests (unit and integration) 
> on commits to public branches and trunks - if I cannot do this I 
> create a temporary private branch so my tinkering does not cost others 
> pain and grief.
>
> (2) Try to commit coherent chunks of work in a single transaction 
> which are easy to track.  This makes it so I can add a one liner 
> summarizing the general gist of the change.  For example it may be 
> something like "Adding support for varargs in Entry API ...".  Then I 
> can have some bullet points describing the specifics of the changes if 
> they are noteworthy.
>
> (3) Avoid mixing bogus formatting/refactoring changes with bug fixes 
> or feature work.  When refactoring changes like class renaming or 
> formatting I make sure this is in a separate commit so it does not 
> obscure the diff of these other kinds of changes.
>
> (4) Don't pick at unrelated issues.  Focus on the objective.  Make the 
> changes and commit with a coherent focus message and diff so the 
> change is easy to understand.  I can relax later and pick at things 
> like changing the code to support generics or something I saw that was 
> not right but unrelated.  Use JIRA to note things or a notepad and get 
> back to them after you have reached your objective and committed the code.
>
> The overall goal with these rules is to have a searchable, easy to 
> understand version history without noise to obscure the picture.  
> Apache projects in general promote this with unofficial rules like 
> don't use tabs but use spaces - this is so whitespace variance to does 
> not produce bogus chunks in diffs.   This is just one example.  Noise 
> or chatter should be reduced because there's a lot of information to 
> be filtered.
> </seriously>.
Let me give you some enlightments about the way I work, which is somehow 
different, but not that much.

(1) Totally agree. I'm not creating enough branch, and this is a bad habit.

(2) I usually prefer spliting the commits in smaller chunks, because 
it's sometime difficult to check what has been changed when you are 
committing more than a couple of files. If you are using eclipse, as 
soon as you have opened the 'commit' popup, you can't go back to the 
code and look at the differences. So it's more confortable to do a 
comparison, and then a simple commit. Let's say it's a IDE driven 
technic, and if the popup was not modal, I may gather more files in one 
commit. But I also try to avoid single files commits when I have some 
related files to be committed.

(3) That is much more questionable, IMHO. For some reason, I found that 
when you walk the code, and you find a bad javadoc, a bug, a typo, then 
you should fix it, otherwise you may never come back and fix it. In 
fact, this is always the case : you never come back and clean the code...

(4) Pretty raisonnable. Every time I tried to chase two targets, I 
rarely caught one of them, not to mention I caught both :)

Basically, I don't think we are far from working in a different way. I 
can deal with your (2). It's just a matter of opening a text editor, 
which is something I'm doing more and more now, trying to follow your 
best practices. (just because they are, well, better than mine :)

That being said, let's continue the CI discussion.
>
>
>     On Tue, Mar 18, 2008 at 12:41 PM, Emmanuel Lecharny
>     <elecharny@gmail.com <mailto:elecharny@gmail.com>> wrote:
>
>         Alex Karasulu wrote:
>         >
>         >
>         >     >  - - How often shall a build be done (compile/test,
>         sitegeneration)
>         >
>         >     We have many possible options. We tried something like
>         kicking CI
>         >     after each commit, but it leads to issues (usually, we
>         don't commit
>         >     code in one big shot,
>         >
>         >
>         > Yes this does happen but it's bad practice on our part.
>         I don't think so. First because when you commit, you usually have
>         already checked locally that the server is ok (_usually_ =>
>         sometime,
>         this is not the case ;). 
>
>
> We can't trust that.  This is the reason in the first place for 
> setting up a CI server.
I won't say that. I would rather say that we trust it, but we know that 
we will have some bad commits from time to time, and the CI will catch 
those bad commits.

<WARNING>I'm *not* saying that you don't understand what is a CI system 
in the following paragraph</WARNING>
For me, there is a big misconception, and also a really bad 
understanding of what a CI system is, and I saw that on many projects I 
worked on, even before working on ADS. When a CI is installed, after a 
few weeks, dev just become lazzy, commits the code when leaving the 
office/going to bed, thinking : "F*ck the potential breakage, I will 
look at it tomorrow morning, anyway the CI will inform me ...". A CI 
should just be the latest safety net you have.

<personal vision>
I think that CI is useless. You don't have to share this vision, but I 
never saw an environment where a CI was enforcing the code quality and 
the commit quality. Except when working on a branch, it's just a waste 
of time and energy. I even saw a CI being installed as a way to protect 
the integration team against junk code being pushed by a dev team which 
didn't took enough time to run integration tests, just becaus, eh, the 
integ team is just here for that purpose, nah ?

So what a CI is good for ? Well, when working in a branch, it can be 
usefull, as you can commit blindly, and check the day after if you broke 
something. It saves time, as you don't have to run integ tests.

What else ? When you have big and costly integration tests, then, yes, a 
CI is mandatory.
</personal vision>

 
>
> Although I will entertain you by actually addressing your points, I 
> really don't have to since you've repeatedly broken the damn build 
> more than all of us combined.
That hurts ;)  But this is so damn true ... I hope the situation is 
better now, as I'm trying to discipline myslef, and also thanks to the 
good work you did with the 'reverter' ! A fast integration test is 
probably the most important factor in better commits. Now that it only 
takes 8 minutes to run them, instead of 25', I'm more serious about 
running them before every commits...

May be my commit rate is also a factor ;) (but certainly not an excuse !!!)

> Problem is, you write some good code so we let you commit anyway.  
> Secretly we all talk behind your back.  "Oh man Emmanuel broke the 
> damn build again.  When is he going to start committing things 
> properly.  I guess we just have to live with it since he does so much 
> for the project, makes the server fast and buys everyone beers at 
> ApacheCon."
 Ahhh !!! Yes, I heard the 'whispers' while I'm quietly sleeping while 
your are trying to fix what I have broke :)
>
> <seriously>
> Another reason why I take the time is because you live with my faults 
> too which are much worse and I am thankful for your patience.  I owe 
> you that as this is a meritocracy as well as many other things.  This 
> is the great thing about this project.
> </seriously>
<seriously too>
You are right, breaking the server is bad. Everytime I did it, I felt 
guilty.
</seriously too>

<not seriously>
And now the problem is that I barely can blame Maven, it's so much 
better than it was ! So it's all my fault ...
<not seriously/>

>  
>
>         Second, because doing more than one commit
>         allows you to comment more precisely what kind of modification
>         you have
>         done.
>
>
> Emmanuel your comments clutter the repository as they do the mailing 
> lists with sometimes unnecessary information in a volume that is 
> overwhelming.  Besides sounding like someone with peanut butter stuck 
> to the roof of your mouth, you're constantly pointing out little 
> details that don't matter that much.  Either that or you're off topic 
> in la la land.
I won't worry to much about the volume :) Last year, I committed around 
1200 times, and we are now at revision 566 000... Ok, I may commit less 
small modifications... I'm trying :)
>
> Ok ok let me be a bit more serious ... I'll stop trying to make you 
> laugh for a minute.
np !
>
> We don't want log messages cluttered in the svn commit log with things 
> like,
>
> r1 "Adding generics to Comparator arguments"
> r2 "Correct spelling"
> r3 "Added Javadocs to methods"
> r4 "Renamed some variables for clarity"
> r5 "Added some defensive checks for null arguments"
>
> Instead you can have this in one commit like so:
>
> r1 "Cleaning up various jdbm-store module ...
>  
>    o Adding generics to private method arguments
>    o Correct spelling because Alex can't spell
>    o Added Javadocs to methods
>    o Renamed some private variables for clarity
>    o Added some defensive checks for null arguments "
Well, I must say I agree. I'm trying to gather the comment more and more 
now. Just because it's almost useless to people, and also because when 
you have to track a commit in the middle of a forest of useless commits ...
>
> Now is this mandatory? Nope not according to rule #1 which is most 
> important.  Just don't break the build and sometimes when you have to 
> you can commit these kinds of changes one by one.  This clumping of 
> minor details into one commit just keeps the clutter down and makes 
> searching the repository logs much easier to do.  Sometimes you'll 
> just have no choice to do these independently and that's ok but you 
> should have the intention of keeping the noise down.
Fair enough.
>
>
>         I personnally don't really like to commit a big shot of code,
>         unless it
>         is really closely connected. 
>
>
> Right understood.  Usually I like to try to have a clear concise 
> picture of what I will do for a specific commit to restrict it to 
> closely related changes but this is not always possible. 
>
> Just think how much more useful our logs can be to us when we're 
> hunting down historical changes if the changes are more coherent 
> instead of mixed, unrelated, and strewn with bogus non-critical logs.
I would ponder my previous comments, which was not really reflecting 
what I wanted to say : I like to commits related things together, to a 
certain extent, but sometime, I also like to spilt the commit in more 
than one piece. This is typically the case when committing in shared and 
in apacheds.


>  
>
>         But as I also fix some javadoc, bugs,
>         warnings while browsing the code, I like to commit in smaller
>         blocks.
>
>
> Yeah I understand and you can do that.  Just add the javadocs and make 
> the fix while working in the same regions sure.  But you don't have to 
> commit that separately: most likely no one knows about the bug fix you 
> just made unless they've noticed it.  Put the bug fix in a bullet 
> point on the overall commit like, "found nasty bug in filter parsing 
> and fixed". Another thing is you loose focus and pick at some things 
> as if you had the same luxury when you pick at your arse.  Often these 
> things are unrelated.  Like you start fixing something then start 
> doing generics all over the place and commit something that obscures 
> your main intentions.  You gotta stop drinking so much buddy.
:)

I have also slightly changed my commit habits lately. I now try to 
gather the related modifications in a single commits, and avoid single 
commits as much as I can.

<joking>
My ohloh commit-meter is now high enough ;)
</joking>
>
> <seriously>
> The collateral changes you make are very valuable and I appreciate 
> your taking the time and effort to make them.  This is all good in the 
> sense that it makes the code more readable so please note that I'm not 
> talking about absolute rules here.  These are guidelines to follow 
> that lead to overall better conditions on a big project.  There's a 
> lot of code and information here so let's try to make that information 
> easy to filter.
> </seriously>

It takes time to be infusate with good practice. Last year, I was 
committing like crazy, one change = one commit. And now, I find this 
futile, and almost useless. This was at a time I was 'killing' warnings, 
and it was not really smart.

I found it more usefull now to limit the number of commits and to add 
more precise comments. I think this is reflected in some the latest 
commits I did :

URL: http://svn.apache.org/viewvc?rev=637318&view=rev
Log:
o Added the DefaultServerAttribute class missing methods and tests
o Lot of tests added
o Removal of the useless AbstractServerAttribute class


URL: http://svn.apache.org/viewvc?rev=637317&view=rev
Log:
o Addition of the ClientAttribute interface
o Addition of the DefaultClientAttribute implementation
o Lot of tests added for this class
o Removal of the useless AbstractClientAttrbute class, as the inheritence scheme has been heavily reworked and simplified
o Other minor related modification


>  
>
>
>         > I prefer a build on each commit so it's easier to catch the
>         offending
>         > commit and isolate it to a user who can be informed
>         immediately while
>         > they still have a mental stack in memory.
>         If you kick a build after each commit, you may have many
>         builds kicked
>         when a lot of commits are done. 
>
>
> Well if you are committing like a super sonic hedge hog then yeah.  
> That's the whole point of my email here.  Let's not commit every time 
> we dot our i's or cross our t's. 
ok got the message :)
>
> I know this is all about your "oh lo" ratings though :-). 
Shit... discovered :)
> Which is cool so I might change my mind on this whole topic and start 
> committing like a lunatic.  Oh lo is showing you out committing me.  I 
> think I'll setup a cron job to just format, unformat and commit some 
> random file just to get back on top.
ah ah ah !!! I wish I'm better at Perl than I'm ...
>  
>
>         I also think that it's quite rare that a
>         commit break the build (it happens, say, every sic months ...), 
>
>
> Ha ha ha you did not just set me up for this did you? Man if you're 
> Emmanuel the build breaks when you walk by the computer.  That's why 
> we want to keep the build server here in the US far far way from your 
> desk.

Whatever, I do have root access to them ;)

>  
>
>         and when
>         it does, being able to point the offending commit does not
>         really helps
>         to fix the breakage, because the offender is generally already
>         sleeping :)
>
>
> People will break the build at different times.  Usually when tired 
> and sleepy is the case and probably occurs more often so you may have 
> a point there.  But you're dead wrong about this not being useful.  
> Just having the shame that comes with breaking the build is a good 
> thing.  It's what brings the arrogance and sense of can't do no wrong 
> of all of us down.
100% agree
>
> When you slip the CI server tells you you f**ked up.  A slap in the 
> face from a machine instead of me having to feel bad about telling you 
> that you screwed up again.  So if the CI works at the end of the day 
> it cannot catch those accountable since several errors could be mixed 
> together from different offenders.  And it's their responsibility to 
> get their asses up to fix the problem.
Here, I disagree. Having you ( or any other committer) complaining about 
a breakage is *way* stronger than if the server complains... It makes 
you more carefull (see one of my point up)
>
> Plus how do we know which change broke the build.
As you don't use kick a build on CI every 5 minutes, the preoblem remains...
>
> Remember we all live in the same code base and can give each other a 
> hard time by how we conduct ourselves.  These reminders are good when 
> we all slip - I personally want immediate notification.  I want to 
> know when I screw up and am ok if others see it.  I want the pressure 
> to force me to change.
I would favor a local CI then. I personally thinking about installing 
one on my laptop, just as a safeguard.
>
> Good engineers architect around their own faults.
>
> We need alarms to tell us we're messing up some how.  Why? Because we 
> all do no matter who we are.  I like committers who think they will 
> f**k up better than those that think they will not because those folks 
> are more honest and they actually double check there work. That's all 
> I could ask for.
I buy the idea of alarms, but they should remains alarms. Not a way to 
augment the risk you can take.
>  
>
>
>         >
>         > I personally would like to know immediately when I goofed
>         something
>         > while that something is still in my head.
>         Well, run the tests before committing should be enough, isn't it ?
>
>
> I do but I screw up at times.  Do you run all the tests every time you 
> commit?  We need a safety net.
Almost every time, but when it's obviously a begnign commit (like fixing 
a javadoc). Sadly, the evil is in details :)
>  
>
>
>
>         Don't get me wrong : I don't say that we should fragment
>         commits as much
>         as possible, nor I say that knowing which commit has broke a
>         build is
>         useless, I just say that a CI should be an airbag, when the
>         integration
>         is the safety belt. 
>
>
> I like that comment a lot!
>  
>
>         never commit without fasten your safety belt
>         (-Dintegration test), and in case you crash the server, the
>         the airbac
>         (CI) may save your life !
>
>
> Just perfect.  We're not so much off base.  OK I'm going to leave me 
> nasty comments above even after reading this because he he I took the 
> time trying to be funny. 
I think we are sharing the same concerns. I'm mostly with you 99% on 
what you wrote.

This is however an interesting convo, because it helps to understand the 
rational behind the rules and guidelines we are using, and also because 
it's like a vaccine : every year, you need a fresh injection !

Thanks Alex !

-- 
--
cordialement, regards,
Emmanuel L�charny
www.iktek.com
directory.apache.org