incubator-ooo-dev mailing list archives

From drew <d...@baseanswers.com>
Subject Re: Ditching our mirror system for an inferior solution? (was: Re: About Testing the SourceForce Mirror of AOO 3.4)
Date Sun, 15 Apr 2012 18:28:58 GMT
On Fri, 2012-04-13 at 23:15 +0200, Eberhard Moenkeberg wrote:
> Hi,
> 
> On Fri, 13 Apr 2012, Peter Pöml wrote:
> > On 03.04.2012 at 18:17, Roberto Galoppini wrote:
> 
> >> We at SourceForge have worked the last ten days to line-up dedicated
> >> infrastructure (including CDN services) to support the upcoming AOO
> >> download serving test.
> >
> > I can hardly believe reading this!
> 
> Me too. What an ignorance of proven and waiting mechanisms.

Hello Eberhard,

I see that your email account was not registered with the mailing list,
as it was moderated through, so will leave all the CC addresses in
place.

Additionally I will assume you have not followed the exchange which did
take place on this list since Peter's mail arrived. Though I believe a
reasonable decision has been reached already within the project during
the exchange I'll try to recap, without attempt to re-open the decision.
[of course the ml archive is publicly available, also]

First - thank you for your contributions, from myself and I'm sure other
individuals who have benefited from the use of OpenOffice.org.

Secondly, your contributions have been, as you say, over a long period
of time and it is appropriate to view the decisions required with a long
term focus. Perhaps no maxim about time is truer than this: the only
constant over time is change.

It appears from my perspective of one person watching the transformation
here that the folks in the foundation have certainly extended a hand
towards bringing the mirrorbrain system into the existing
infrastructure. This point I want to emphasize: the ASF is a very
capable, well established community and infrastructure. They did indeed
take on the responsibility for seeing that the functioning of the
project proceed, best as possible, going forward. There are no guarantees
in life of course but, given the track record your email highlights and
that of the ASF, there is every reason to believe 'forward' here will be
another 20 years.

It also seems to me that anyone wanting to assist with the project going
forward needs to recognize that, in many basic decisions, a view towards
OpenOffice alone is now too self-centered.

Which is all a long way to get to the fact that IMO, after Peter's email
from a few days back, the group did stop and look, with a direct eye
to the task at hand, now. The next release candidate is forming now.

To my view the most important email in this thread was Joe's last. I
hasten to add that there is a mirrors mail list which I'm not,
correctly, subscribed to. I do though monitor the general infrastructure
list in addition to the main project lists.

Joe (the ASF infrastructure team in general also) has asked a number of
times for contact and more direct engagement with regards to the
mirrorbrain admin functions. Sure seems that way, and I can't say that
I've seen much of that coming back.

The decision, and one needed to be made, includes that mirrorbrain will
be used for this coming release. I would suggest that the very first
task is to stop with that _and_ focus on making that happen with the
mirrorbrain network, in the context of the new reality that OpenOffice
is part of larger working community now.

As it stands the decision is also to end using mirrorbrain going forward
from there, and again I'd say that IMO from following the mailing lists,
it seems like the appropriate decision at the moment.

I'd also add by way of hearkening back to my first point - the only
constant over time is change. mirrorbrain is not defined by openoffice
distribution alone, nor can anyone say how quickly this project will
move from 3.4 to 4.0, so who knows what time holds - but what is for
sure, the best thing for everyone concerned, both today and for
tomorrow, is for Peter and others I would hope, to engage directly with
Joe and the other ASF admins to work on the task at hand, now. 

Those are my thoughts on this, of course. At this point I truly can't see
what benefit continuing this mail thread would bring, but I do hope Peter
and others are quick to work with the admins for the benefit of this
release.

Best wishes,

//drew

> 
> > What's going on? We have an existing (and well working) mirror network, 
> > that handles any required load just fine. It's proven and time-tested. 
> > It has survived all releases with ease. By all calculation, and by 
> > practical experience, the combined upload capacity of the mirrors is 
> > sufficient to satisfy the peak download demand as well as the sustained 
> > demand. By the way, the "peak download demand" doesn't really differ a 
> > lot from the day-to-day download demand, contrary to public belief. The 
> > mirrors are numerous and spread around the world, and the chance of a 
> > client being sent to a close and fast mirror is good - better than with 
> > a handful of mirrors as is the case with the Sourceforge mirror network. 
> > Sourceforge specializes in something different - providing a myriad of 
> > small files by a set of specialized mirrors. "Normal", plain simple 
> > mirrors can't take part in this network as far as I can tell.
> 
> Yes. I had tried to help with ftp5.gwdg.de - impossible "unconditionally".
> 
> > Even though the network was considerably extended a few years ago, from 
> > 10 (under 10?) to >20 mirrors, this is still a small number of mirrors. 
> > (Even though these are power-mirrors, but those are part of our existing 
> > mirror network just as well.)
> >
> > With our mirror network, mirrors can mirror partial content, so they can 
> > provide what's important in their region, like certain language packs 
> > only. This greatly increases the likelihood of finding mirrors in remote 
> > areas, that don't have hundreds of gigabytes to spare. It's also 
> > unnecessary that mirrors carry old releases that are infrequently 
> > downloaded. Mirrors can run whatever HTTP software they prefer, not only 
> > Apache httpd, or even FTP servers. Mirrors can decide to offer mirroring 
> > only in their network/autonomous system/country to limit the share of 
> > requests they get, and from where they get it. Many mirrors don't have 
> > good international connectivity, but can be used well with us 
> > nevertheless. We provide cryptohashes, Metalinks, even P2P links, all 
> > fully automatically. That's very important for these unusually large 
> > files. Downloading without error correction is not fun. We select 
> > mirrors by GeoIP, but also by geographical distance as well as network 
> > topology, whatever gives a close match, and we already support IPv6.
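
To make the point about cryptohashes and error correction concrete, here is a minimal sketch of piece-wise checking in the style that Metalink-type descriptions allow: the file is hashed in fixed-size pieces, so a client can find out which pieces are damaged and fetch only those again. The function names and the piece size are my own assumptions for illustration; this is not taken from MirrorBrain or from any particular download client.

```python
import hashlib


def piece_hashes(data: bytes, piece_size: int) -> list[str]:
    """Return the SHA-256 hex digest of each fixed-size piece of data."""
    return [
        hashlib.sha256(data[i:i + piece_size]).hexdigest()
        for i in range(0, len(data), piece_size)
    ]


def corrupt_pieces(data: bytes, expected: list[str], piece_size: int) -> list[int]:
    """Return the indices of pieces that do not match the published digests.

    `expected` stands in for the per-piece digests a download service
    would publish next to the file (hypothetical input).
    """
    actual = piece_hashes(data, piece_size)
    bad = [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]
    # A truncated or over-long download leaves pieces without a partner;
    # those count as bad too.
    bad.extend(range(min(len(actual), len(expected)),
                     max(len(actual), len(expected))))
    return bad
```

With a 200MB file, a single flipped byte then costs one piece of re-download instead of the whole file.
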
> >
> > It has taken some years to build all this, and a lot of the features 
> > were triggered directly by the work on the OpenOffice.org redirector. 
> > Built for OpenOffice.org
> >
> > The software is the one kind of work that went into it, finding and 
> > collecting mirrors the other thing, building trust and lasting 
> > relationship. A mirror network isn't built overnight.
> >
> > I think there is a danger that the Apache mirror network is equated with 
> > the OOo mirror network. This is a mistake in my view. The large files 
> > that we have are a totally different challenge. It's a huge difference 
> > to download 6MB tarballs and 200MB files, both from the user's 
> > perspective ("why does my file not work, that I waited so long for!?") 
> > and from the mirror's perspective ("what are these 200 connections from 
> > Chinese IPs on my mirror server!?"). It is important to be able to give 
> > mirrors different weight, because they differ vastly in their 
> > capabilities, which can range from 4GBit bandwidth down to a brittle 
> > 50Mbit somewhere else. Even inside an "Internet country" like Germany 
> > you'll have differences of 100 MBit to multiple Gbit, and you want to 
> > utilize the bandwidth well. We have this working well!
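
The idea of giving mirrors different weight can be sketched as a weighted random choice, where a mirror's share of requests is proportional to its assigned weight. The mirror names and the use of bandwidth in Mbit/s as the weight are assumptions made for this example only; a real redirector would likely combine weight with location and availability.

```python
import random


def pick_mirror(weights: dict[str, int], rng: random.Random) -> str:
    """Pick one mirror name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical mirrors, weighted by rough bandwidth in Mbit/s.
EXAMPLE_WEIGHTS = {"big-mirror": 4000, "mid-mirror": 100, "small-mirror": 50}

if __name__ == "__main__":
    rng = random.Random(2012)
    counts = {name: 0 for name in EXAMPLE_WEIGHTS}
    for _ in range(10_000):
        counts[pick_mirror(EXAMPLE_WEIGHTS, rng)] += 1
    # The 4GBit mirror should receive the overwhelming share of requests.
    print(counts)
```

The small mirror still takes part, but it only sees about one request in eighty, which is roughly what its line can carry.
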
> 
> I can confirm this, I have watched the growing "intelligence" of 
> MirrorBrain from the beginning.
> 
> > OpenOffice.org used a software called "Bouncer" before switching to 
> > MirrorBrain, which was one of the simpler solutions. I think everybody 
> > (who has been in the project a few years) will agree that we don't want 
> > to go back.
> 
> Surely. The OpenOffice step from bouncer to mirrorbrain was agreed all 
> over to be a step up in performance and quality.
> 
> BTW, dear Apache people, I am the one that helped StarOffice Hamburg to 
> publish their first opensource release - maintainer of ftp.gwdg.de for 
> 20 years.
> 
> > So I see that Sourceforge wants to beef up their network by renting a 
> > Content Delivery Network (CDN). Is that needed? yes, because they don't 
> > have enough bandwidth in mirrors. Is that a good idea? I don't think so, 
> > but I'm biased, because 1) I don't like advertisements and 2) I'm 
> > strongly rooted in the mirror community with both legs.
> 
> Didn't mirrorbrain lately help Novell to save a lot of money they 
> regularly had spent on Akamai before? I guess it was this way.
> 
> > In the mirror community, there is a kind of self esteem among the more 
> > ambitious mirror admins: they believe that stepping in of commercial 
> > CDNs is not needed to handle even peak download demand of the most 
> > popular Open Source software. And they work hard for it.
> 
> Yes, we do. All mirror admins love to see their lines full. That is the 
> temporary excitement we are struggling for. Mirrorbrain can give us this 
> picture at the spot moments without frustrating any single user.
> 
> > Together, we have proven that the help of commercial CDNs is *not* 
> > needed, both with OpenOffice.org and with OpenSUSE.org. Mirrors have 
> > served > 20 GByte per second together. The bandwidth is there! (In the 
> > past, Akamai was used during release peaks with OpenSUSE.org, so I have 
> > been there, and also got interesting insight and numbers there.)
> >
> > I tried the currently configured download from 
> > http://www.openoffice.org/download today (from a real crappy end user 
> > box ;). It was slow and didn't start downloading immediately, but showed 
> > a page full of advertisement that didn't have any relation to 
> > OpenOffice.org, wanted to open a popup (MS IE said that and blocked it)
> 
> Hey, Peter, you and MS IE - what's going on? Are you letting others 
> drive you crazy?
> 
> > and when the download started, it came from the Swiss mirror, but I'm in 
> > Germany! What's that? Thrown 3 years back in time? Sub-optimal. (I can 
> > guess who pays for the CDN that is rented to help out: advertising.)
> >
> > Do you really want to ditch what we have built? Ditching the system that 
> > improved downloading OpenOffice.org in the farthest corners of the 
> > world? Exchanging it against a handful of Sourceforge mirrors, and 250 
> > Apache mirrors, many of which lack the capability? Some are big, but 
> > many will be far from having the bandwidth to deliver large files.
> >
> > Something that Apache's mirror system also can't do is sending me to my 
> > local mirror (my very ISP in my city runs a mirror, and my home IP is in 
> > their netblock). Apache mirror system sends me to *any* mirror in my 
> > country, while our current solution recognizes the network topology and 
> > lets me download from the local mirror. Especially with large files, 
> > that's very nice both for the ISP and for me as user. Sourceforge can 
> > theoretically do this (because they use a part of MirrorBrain for that 
> > purpose!) but don't have enough mirrors to play this out. This is not 
> > only useful with single ISPs, if they have a mirror; it's also useful 
> > with autonomous systems (AS) of networks that share a backbone, like 
> > most German universities in AS680 here in Germany.
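
The selection order described here (same netblock first, then same autonomous system, then same country) can be sketched as a simple fallback chain. This is only an illustration under my own assumptions: the `Mirror` record, the example prefixes (documentation address ranges) and the private-use AS numbers are made up, and the client's AS number and country are taken as given rather than looked up.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mirror:
    name: str
    prefix: str   # netblock the mirror serves locally, in CIDR notation
    asn: int      # autonomous system the mirror belongs to
    country: str  # two-letter country code


def select_mirror(client_ip: str, client_asn: int, client_country: str,
                  mirrors: list[Mirror]) -> Optional[Mirror]:
    """Prefer the closest match in network terms, then fall back by country."""
    ip = ipaddress.ip_address(client_ip)
    # 1. A mirror inside the client's own netblock (e.g. the client's ISP).
    for m in mirrors:
        if ip in ipaddress.ip_network(m.prefix):
            return m
    # 2. A mirror in the same autonomous system (shared backbone).
    for m in mirrors:
        if m.asn == client_asn:
            return m
    # 3. Any mirror in the same country.
    for m in mirrors:
        if m.country == client_country:
            return m
    return None  # the caller would fall back to distance or a default
```

A university client in AS680 without a mirror in its own netblock would then land on a mirror in AS680 before any other mirror in Germany.
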
> 
> The german university network (DFN-Verein, some members already are
> "producing" 10 gbit) was the base infrastructure for the openoffice 
> spreading (and staroffice before, and is now already with libreoffice 
> too).
> 
> Please don't neglect this chance for the Apache Foundation. It clearly is 
> offered (and - regarding ftp.gwdg.de and many more - since the beginning 
> of Apache practiced).
> 
> > So we will have a *technically inferior* solution in the future? That's 
> > not the Apache way, is it?
> >
> > I have been told more than once, on this list, that "it will be the 
> > Apache mirror system and nothing else". I didn't understand the reasons 
> > (except for policy, no special treatment for individual projects), but 
> > it won't work that way IMO.
> >
> > Now it seems to me that the Apache mirror system sought the help of 
> > Sourceforge.net. If that means that some doubts crept up, then I share 
> > those doubts. But I don't see Sourceforge.net as the solution either, as 
> > explained above. They have their merits, and I like their dedication and 
> > the specialized system they've built (with features that I'm envious 
> > of!), but I think our existing solution is better suited. And not only 
> > that, IMO it is a very important prerequisite of being successful. No 
> > well-working downloads, no luck with distributing FOSS that consists of 
> > large files.
> 
> Dear Apache Foundation, please listen to Peter's words and use his work.
> It will be a win for you - incredible that you did not realize that 
> already by yourself. You are a "community product", and so you should help 
> to show that "the community" is autonomous.
> 
> 
> Best regards
> Eberhard Moenkeberg (emoenke@gwdg.de, em@kki.org)
> 


