From: "Derek Hohls"
Date: Mon, 19 Feb 2007 09:44:04 +0200
Subject: Re: HTML scraping with cocoon

For Cocoon version 2.1.x, see:
http://cocoon.apache.org/2.1/userdocs/html-generator.html

>>> philguillard 2007/02/16 01:05 PM >>>
Using HtmlGenerator, you get HTML pages transformed into XHTML, which you
can then manipulate with XSL transformations.

Phil

Saliya Ekanayake wrote:
> Hi,
>
> I'm trying to implement an HTML screen scraper and I heard that Cocoon
> can be used for this. Can anybody point me to a good link regarding this?
>
> Thanks in advance
>
> Saliya Ekanayake
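
The second half of the approach described above (running an XSL transformation over the XHTML to pull out the data you want) can be sketched outside Cocoon with the JDK's built-in javax.xml.transform API. This is only an illustration under stated assumptions: the input is taken to be already well-formed XHTML, which is the step HtmlGenerator performs inside Cocoon; the class name, the stylesheet, and the sample page are all hypothetical and do not come from the thread or from Cocoon itself.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XhtmlScrapeSketch {

    // Hypothetical stylesheet: emits one "text -> href" line per hyperlink.
    // local-name() is used so the match works whether or not the elements
    // carry the XHTML namespace.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:for-each select=\"//*[local-name()='a'][@href]\">"
      + "<xsl:value-of select='normalize-space(.)'/>"
      + "<xsl:text> -> </xsl:text>"
      + "<xsl:value-of select='@href'/>"
      + "<xsl:text>&#10;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to an XHTML document held in a string and
    // returns the plain-text result.
    static String extractLinks(String xhtml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xhtml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Made-up sample page standing in for tidied XHTML output.
        String page = "<html xmlns='http://www.w3.org/1999/xhtml'><body>"
            + "<p>See <a href='http://cocoon.apache.org/'>Cocoon</a> and "
            + "<a href='http://www.apache.org/'>Apache</a>.</p>"
            + "</body></html>";
        System.out.print(extractLinks(page));
    }
}
```

In a Cocoon 2.1.x pipeline the same stylesheet logic would presumably sit in a transform step placed after the HTML generator; the documentation linked above describes how the generator itself is configured.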