commons-dev mailing list archives

From Simon Kitching <skitch...@apache.org>
Subject Re: [ANNOUNCEMENT][Betwixt] Betwixt Release 0.6.1 RC3 Available
Date Thu, 09 Jun 2005 22:33:06 GMT
On Thu, 2005-06-09 at 20:29 +0100, robert burrell donkin wrote:
> On Thu, 2005-06-09 at 10:41 +1200, Simon Kitching wrote:
> > On Wed, 2005-06-08 at 22:21 +0100, robert burrell donkin wrote:
> > 
> > > > ===
> > > > Maven reports:
> > > > 
> > > > I would suggest disabling this report. Firstly, a log of the last 30
> > > > days isn't of much use. And secondly, due to the import into SVN of
> > > > back-dated CVS changes, date-based selection on the apache subversion
> > > > repository is broken, so the report is not just useless but actively
> > > > WRONG.
> > > > 
> > > > I suggest that "Developer Activity" and "File Activity" reports are also
> > > > useless, and (if based on SVN date selection) also wrong.
> > > 
> > > AIUI the problem occurs only with the dates on the imported data. new
> > > data is fine. i've checked the results and they look about right. i do
> > > agree that they aren't all that useful but i know some users like them
> > > so i'm inclined to keep them...
> > 
> > Unfortunately the date problem is repository-wide, and long-lasting.
> > 
> > When subversion is passed a date, it immediately converts this into a
> > revision-number. And it does this by performing a binary search on its
> > revisions.
> > 
> > Assuming there are 1000 revisions currently in the repository, it
> >  * checks whether revision 500 is earlier or later than the desired date
> >  * 500 is earlier, so revision 750 is checked
> >  * 750 is earlier, so 875 is checked
> >  * 875 is later, so 812 is checked, etc
> > 
> > This process is based on a fundamental assumption that revision X has an
> > earlier date than revision X+1. When this assumption is broken, the
> > binary search can go off in the wrong direction. And it looks to me like
> > after a "problem" import date selections will continue to be broken
> > until the revision# has at least doubled in size.
> > 
> > Whether you actually get bitten by the problem for a particular
> > date-based select is a bit of a lottery; if the binary search happens to
> > hit "valid" nodes all the way down, the search will work correctly. But
> > hit the wrong node and the select can be a long way out.
> > 
> > Alas, it is just not possible to "insert" revisions into an existing
> > repository; when importing data it can only be added as new revisions.
> > 
> > So the current choice when importing CVS history is 
> > (1) stuff up all date-based selections *repository-wide*, or
> > (2) discard all date information associated with imported CVS history,
> >     and put "current" dates against the revisions.
> 
> thanks for the detailed information
> 
> once all the CVS repositories have been imported, this should no longer
> be a problem but until then i'll disable the reports.

Alas, I don't think so. As I said above, I believe that after the last
CVS import has been done, we will still need to wait for the number of
revisions in the repository to double before date-based searching is
completely reliable again.

I expect that the revision number will reach about 200,000 by the time
everything is imported into SVN. So everything will be right again once
the count reaches 400,000. That should only take a decade or so :-(

Regards,

Simon



---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org

