Message-ID: <40A667CD.8040103@upaya.co.uk>
Date: Sat, 15 May 2004 19:56:13 +0100
From: Upayavira
To: dev@cocoon.apache.org
Subject: Re: Wiki conversion status
In-Reply-To: <1084608461.1306.2188.camel@ighp>

David Crossley wrote:

>Upayavira wrote:
>
>>David Crossley wrote:
>>
>>>Links inside headings are not handled
>>>e.g. http://wiki.apache.org/cocoon/BlockDescriptions
>>>should have a local link in the heading which goes to
>>>our Batik wiki page.
>>>
>>>old: http://wiki.cocoondev.org/Wiki.jsp?page=BlockDescriptions
>>>
>>>I presume that would be a minor problem. People could add
>>>them post-conversion.
>>>
>>This is something Moin can't do. This is something I think we'll just
>>have to do manually post (or pre-) conversion. Just move the link out of
>>the heading.
>>
>
>Okay, post-conversion. We can add them back if really needed.
>
>>David Crossley wrote:
>>
>>>The process of running the conversion script is an
>>>excellent opportunity to automatically catch some spam
>>>that has crept in.
>>>
>>>There is no doubt that we have missed some vandalism
>>>cases. We are only a few humans trying to manually catch it.
>>>Also remember the problem with the diff notification that
>>>only runs every hour and we only get the most recent change.
>>>
>>>Is it possible to generate a list of vandalised pages?
>>>For example one pattern is "emmss.com".
>>>
>>>On the other hand, we could probably run some 'find | grep'
>>>commands on the server-side after the conversion.
>>>
>>If we can come up with simple rules as to how to implement this, then
>>yes, but I'd rather just get the conversion done.
>>
>
>Definitely, just get the conversion done. We can fix those afterwards.
>
>>I've got an exclusions
>>file which says which pages to exclude from conversion. I can add files
>>to that.
>>
>>Otherwise, I think a manual grep for http:// would probably be a good
>>idea, and then edit the links out via the gui.
>>
>
>Where does the content end up on the apache server?
>
/www/wiki.apache.org/data/cocoon/data/text is where the actual pages are.

>Those of us with commit access can ssh in and start building
>some tools to find the vandalism.
>
Or just use grep. It is just a directory full of text files.

Upayavira
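
For anyone who wants slightly more than grep, here is a minimal sketch of the same idea in Python. It assumes the pages are plain text files under a directory such as the one named above. The function name scan_pages and the pattern list are hypothetical. Only "emmss.com" comes from this thread, so the list would need extending with whatever other spam patterns turn up.

```python
import os
import re

# Hypothetical pattern list: only "emmss.com" is taken from the thread.
SPAM_PATTERNS = [re.compile(r"emmss\.com", re.IGNORECASE)]

# Rough match for external links, to support a manual review pass.
LINK_PATTERN = re.compile(r"https?://\S+")


def scan_pages(text_dir, spam_patterns=SPAM_PATTERNS):
    """Walk text_dir and report suspect pages and the links on each page.

    Returns a tuple (suspects, links):
      suspects: sorted list of file paths matching any spam pattern
      links:    dict mapping file path -> list of URLs found in that file
    """
    suspects = []
    links = {}
    for root, _dirs, files in os.walk(text_dir):
        for name in files:
            path = os.path.join(root, name)
            # Wiki pages may contain odd bytes; replace rather than fail.
            with open(path, encoding="utf-8", errors="replace") as fh:
                content = fh.read()
            if any(p.search(content) for p in spam_patterns):
                suspects.append(path)
            found = LINK_PATTERN.findall(content)
            if found:
                links[path] = found
    return sorted(suspects), links


if __name__ == "__main__":
    # Path as given in the message; adjust for a local copy.
    suspects, links = scan_pages("/www/wiki.apache.org/data/cocoon/data/text")
    for path in suspects:
        print("SUSPECT", path)
    for path, urls in sorted(links.items()):
        print(path, len(urls), "link(s)")
```

The suspect list could feed the exclusions file, and the per-page link listing is the "manual grep for http://" step in a form that is easier to review. Matching pages still need a human look before anything is edited out.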