Message-ID: <414014B9.4060608@upaya.co.uk>
Date: Thu, 09 Sep 2004 09:30:49 +0100
From: Upayavira
To: dev@cocoon.apache.org
Subject: Re: Custom extensions - to be made available if possible

Antonio Fiol Bonnín wrote:

>Hello,
>
>We have started developing two extensions for Cocoon, and we would
>like to know if the core team would be interested in getting them into
>the trunk, and optionally in maintaining them in the future.
>
>The extensions are:
>
>- A transformer that connects via HTTP POST and sends its XML input to
>the server, and returns the XML returned from the server to the
>pipeline.
>
>This is similar to the SOAP thing, but without the envelope, and with
>a predefined (configured in the sitemap) URL.
>
>
>- An extension to the Cocoon Lucene searching system (or something
>different, yet pending design), so that non-XML content can also be
>indexed. In particular, we are interested in PDF, but we are designing
>it to be as generic as possible.
>
>BTW, your opinion may be very valuable for the design. Let me explain
>the two approaches we have thought of:
>
>a) Refactoring SimpleLuceneXMLIndexerImpl so that its private method
>indexDocument is not private, and taking it to an external component.
>
>b) Creating a PDFGenerator (in the Cocoon sense of generator, of course).
>
>Option (a) seems to be giving us more headaches than pleasure, and
>option (b) seems cleaner to a certain point. Option (b) would allow us
>to follow links in the PDF file, if developed to that point.
>
>However, option (b) implies choosing a format for its output (which?),
>and also poses some problems wrt. the sitemap. Until now, we have had a
>pipeline using a reader to read PDF files (static, from disk). And we
>would need a generator to be invoked instead for the content and links
>views. How can we do that? Maybe with a selector? But that does not
>seem very clean. Any hints there?
>
>Any other options?
>
>Any general comments?
>
>

How are your PDFs generated? Are they generated by Cocoon? If so, you
should index the raw data before you serialize to PDF.

Just a thought.

Upayavira
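
On the open question in option (b) of which format a PDF generator should emit, here is one possible shape, offered only as a sketch. It assumes the text and links have already been extracted from the PDF by some other tool, and the element names (`document`, `content`, `page`, `links`, `link`) and the function name `pdf_to_index_xml` are hypothetical; none of them come from Cocoon or from the proposal above.

```python
import xml.etree.ElementTree as ET


def pdf_to_index_xml(source, pages, links):
    """Wrap already-extracted PDF text and links in a small XML document.

    All element and attribute names here are hypothetical and chosen for
    illustration only. Extracting text from the PDF is out of scope.
    """
    root = ET.Element("document", {"source": source})

    # One <page> element per page of extracted text, numbered from 1.
    content = ET.SubElement(root, "content")
    for number, text in enumerate(pages, start=1):
        page = ET.SubElement(content, "page", {"number": str(number)})
        page.text = text

    # Links are kept apart from the text so they can be read separately.
    link_list = ET.SubElement(root, "links")
    for href in links:
        ET.SubElement(link_list, "link", {"href": href})

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(pdf_to_index_xml("manual.pdf", ["First page text"], ["http://example.org/"]))
```

Keeping the page text and the links in separate subtrees is a design choice made with the content and links views in mind: each view could pick out the part it needs from the same output.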