incubator-ooo-dev mailing list archives

From Shane Curcuru <...@shanecurcuru.org>
Subject Re: [www] html instead of markdown (mdtext)?
Date Fri, 12 Aug 2011 16:50:30 GMT
Hey, my only role with Apache infrastructure is as court jester, so I'm 
not the one to help here: folks will need to organize specific proposals 
and work with the Apache infra team.

Note this is *not* about which is the fastest way to serve content.  The 
real question is how the Apache infra team - working within a non-profit 
org, with limited physical and admin resources - can effectively and 
reliably manage all services our many projects ask for.

Information about Apache infrastructure is available:

   https://www.apache.org/dev/machines.html
   https://people.apache.org/~henkp/  (lots of links)
   https://people.apache.org/~vgritsenko/stats/daily.html
   https://www.apache.org/mirrors/

- Shane

P.S. By default the CMS stores both the mdtext and the final html output 
in SVN.

The mdtext is in your working SVN area, with diffs sent to the appropriate 
project lists, so that the team can see what changes are being made in 
terms of the content people develop.

The final html is in the staging/production website SVN areas, mainly 
for backup recovery and security.  If the server's lost, just install 
httpd somewhere, and then svn checkout the html tree.

On 8/12/2011 12:19 PM, Terry Ellison wrote:
> Rob,
>
> I support your general point. Using static HTML files to achieve
> performance might have been a sound argument in the 1990s, but it isn't
> really credible with today's platform technologies. What are the
> transactional rates for the Apache site? How many requests per second,
> even just roughly?
>
> Taking your example of the MediaWiki engine, this is scaled to meet the
> transactional and data volume demand of wikipedia.org, one of the
> busiest websites on the planet. (There are typically ~100 updates per
> second and goodness knows how many pageviews.) See
> http://www.mediawiki.org/wiki/Cache and the few dozen subsidiary pages.
> There are many high performance caching products that address this issue
> -- Apache even does one: http://trafficserver.apache.org/ -- and the
> MediaWiki engine already integrates with a couple of the leaders: Squid
> and Varnish.
>
> Apache's "heartland" is its "number one HTTP server on the Internet".
> Are we really saying that the best way to manage content is through
> static HTML files? This is just daft IMHO. Has anyone ever heard of
> current CMS technology?
>
> How many content editors and contributors can read HTML these days?
>
> One other point: yes SVN or any equivalent versioning repository can
> store most types of content, but versioning should take place at the
> highest level of abstraction and language that the content providers
> work in. Take an extreme example to emphasise this point. svn can store
> object modules, but does this mean that we should use these as the
> master control and disassemble back to assembly code to update programs?
> Of course not. But to many editors, HTML is little more than binary
> machine code.
>
> Non-functional (infrastructure) requirements help drive the design and
> implementation cycles but they shouldn't unnecessarily limit the true
> functional requirements of the system. To do so is madness. Is it really
> an approach Apache wants to advocate?
>
> Regards
> Terry
>> On Fri, Aug 12, 2011 at 10:10 AM, Shane Curcuru<asf@shanecurcuru.org>
>> wrote:
>>> (To provide a little context while Gav may be asleep)
>>>
>>> On 8/12/2011 9:26 AM, Rob Weir wrote:
>>>> On Fri, Aug 12, 2011 at 3:41 AM, Gavin McDonald<gavin@16degrees.com.au>
>>>> wrote:
>>>>>> On Thu, Aug 11, 2011 at 12:12 PM, Kay Schenk<kay.schenk@gmail.com>
>>> ...snip snip snip...
>>>
>>>>>> Just a thought: Could you do the entire website in MediaWiki, with
>>>>>> only
>>>>>> exception cases (download page, etc.) done in HTML?
>>>>> Just to put a blocker on this right away, we will not be using the
>>>>> wiki
>>>>> as the
>>>>> main website or the main entrance into the OOo world.
>>>>>
>>>> Since it is not self-evident to me why a wiki would be a problem for
>>>> the main website, could you explain this a little further? Is there a
>>>> technical problem? Remember, the wiki already comprises several
>>>> thousand pages of website content, so in a very real sense the "main"
>>>> website is already the wiki.
>>> Performance. As I understand it, the bulk of all apache.org content is
>>> served statically as html files. Putting a major project's homepage
>>> website
>>> like the future office.a.o (or whatever name) up as a wiki would add a
>>> significant amount of load to our servers, even for a highly
>>> efficient wiki
>>> engine.
>>>
>> Thanks, that gives some context. So "main" in this case is not
>> necessarily only the top level page, i.e., an eventually
>> openoffice.apache.org or the current www.openoffice.org. Certainly
>> those pages would be some of the most highly-trafficked pages. But we
>> probably have some others that are also highly trafficked: FAQs,
>> release notes, download page, etc.
>>
>> But that still leaves the long tail of the thousands of other pages
>> that are individually accessed rarely, but may add up to significant
>> load.
>>
>> I'm surprised there is no caching mechanism for MediaWiki to simply
>> write out static versions of pages and then invalidate the cache
>> for a particular page when it is changed. In theory you could have
>> the rarely-changed pages be just as efficient as static HTML. Plugins
>> exist that do this for WordPress, for example.
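
The write-out-and-invalidate idea in the paragraph above can be sketched
roughly as follows. This is only an illustration under stated assumptions,
not how MediaWiki, WordPress, or the Apache CMS actually work: the class
name StaticPageCache, the render callback, and the flat one-file-per-page
layout are all hypothetical, and page names are assumed to be simple
(no slashes).

```python
import tempfile
from pathlib import Path
from typing import Callable


class StaticPageCache:
    """Serve pages from static files, re-rendering only after an edit."""

    def __init__(self, cache_dir: Path, render: Callable[[str], str]) -> None:
        self.cache_dir = cache_dir
        self.render = render      # hypothetical stand-in for the wiki engine
        self.render_count = 0     # how many times the slow render path ran

    def _path(self, page: str) -> Path:
        return self.cache_dir / f"{page}.html"

    def get(self, page: str) -> str:
        """Return the static copy if present, else render and write it out."""
        path = self._path(page)
        if path.exists():
            return path.read_text(encoding="utf-8")
        html = self.render(page)
        self.render_count += 1
        path.write_text(html, encoding="utf-8")
        return html

    def invalidate(self, page: str) -> None:
        """Drop the static copy of one page; call this when it is edited."""
        self._path(page).unlink(missing_ok=True)


def demo() -> int:
    """Two reads, one edit, one more read; returns the number of renders."""
    sources = {"FAQ": "first draft"}
    with tempfile.TemporaryDirectory() as tmp:
        cache = StaticPageCache(Path(tmp), lambda p: f"<p>{sources[p]}</p>")
        cache.get("FAQ")              # rendered and written to disk
        cache.get("FAQ")              # served from the static file
        sources["FAQ"] = "second draft"
        cache.invalidate("FAQ")       # the edit invalidates only this page
        assert cache.get("FAQ") == "<p>second draft</p>"
        return cache.render_count
```

In this sketch a rarely-changed page costs one render per edit rather than
one per request, which is the property the paragraph is asking about. A
front-end web server could serve the files in cache_dir directly.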
>>
>>
>>> The beauty of the CMS is that while it's easy to work on the pages
>>> (either
>>> via SVN or browser), the final result is simply checked into SVN and
>>> then
>>> the resulting .html file is just stuck on the production webserver site.
>>> Some projects use a wiki to manage their homepages (i.e. project.a.o,
>>> separate from any community wiki they may have), but the physical
>>> homepage
>>> that end-users see is typically static html that's been exported from
>>> their
>>> wiki site.
>>>
>>> Gav or infra folk can provide more details, but you should plan on
>>> adhering
>>> to whatever performance restrictions the infra team requires for the
>>> main
>>> website.
>>>
>>> - Shane
>>>
>
