Date: Mon, 13 Jan 2003 16:02:31 +0100 (CET)
From: Stephan Michels
To: cocoon-dev
Subject: Re: cocoon and non XML content ... (was Jackson... five?)
In-Reply-To: <3E22BFAA.7040008@saxess.com>

On Mon, 13 Jan 2003, SAXESS - Hussayn Dabbous wrote:

> Oh, sorry for my question...
>
> Is it possibly the TextParser generator I am looking for?
> Could this parser also handle "unstructured text" as follows:
>
> "Take out a piece of data from the input, replace it by
> something else and finally make all of the stuff a valid
> XML output..."
>
>
> regards,
> hussayn
>
> SAXESS - Hussayn Dabbous wrote:
> > Hi;
> >
> > I struggled with the following problem and wonder if it is relevant
> > and has been solved within cocoon:
> >
> > Assume you have some content that is plain text, e.g. log reports.
> > Now you want to use this text with cocoon. Naturally you have to
> > convert the text to XML. This could be done by writing a new
> > generator of course, which would be specific to the data it has
> > to convert.
> >
> > Now assume you have many different sources that have to be
> > transformed into XML.
> >
> > Wouldn't it be nice to have a generator at hand that could be
> > controlled via configuration?
> > With this I can use one generator,
> > then configure the conversion rules as needed, get the XML data
> > out of it, then proceed within cocoon pipelines ...
> >
> >
> > One possible use case (sounds like a JTidy task, but it isn't):
> >
> > I have several servers that produce very dirty HTML, intermixed with
> > javascript. My generator shall gather data from these sites and
> > not only convert html to xhtml, but also do some necessary modifications
> > within the javascript, which is certainly not a suitable task for XSLT
> > processing, nor for JTidy. I could think of regexp processing here...
> >
> > Rather than creating dedicated generators for every site, I want one
> > generator that can be configured to convert data dependent on the
> > url, or whatever... I think this is just another step towards
> > real content syndication ...
> >
> > What do you think?
> > Any thoughts are welcome ...

The next version of the chaperon components will include a text
generator, which will likely be designed as an XMLizer. That version
will also include a lexical scanner, which uses patterns similar to
regular expressions to tokenize the text. If you don't have structured
text, this LexicalTransformer can be used, for example, for syntax
highlighting.

This version will be finished in the next few days, so stay tuned.

Stephan Michels.
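
As a rough illustration of the idea discussed above (one scanner driven by a table of regex-like rules, emitting XML), here is a minimal sketch. It is written in Python rather than Cocoon's Java and is not the chaperon API: the rule names and patterns in RULES and the functions compile_rules and to_xml are all made up for this example.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical rule table (token name, regex pattern); earlier rules win.
# Swapping this table is the "configuration" step, the scanner stays the same.
RULES = [
    ("timestamp", r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"),
    ("level", r"\b(?:INFO|WARN|ERROR)\b"),
    ("number", r"\d+"),
]


def compile_rules(rules):
    """Combine all rules into one regex with a named group per token type."""
    return re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in rules))


def to_xml(text, rules=RULES, root="text"):
    """Wrap every recognized token in an element named after its rule.

    Text that matches no rule is passed through as escaped character data,
    so the result stays well-formed even for unstructured input.
    """
    scanner = compile_rules(rules)
    parts = [f"<{root}>"]
    pos = 0
    for match in scanner.finditer(text):
        parts.append(escape(text[pos:match.start()]))  # unmatched text
        name = match.lastgroup  # name of the rule that matched
        parts.append(f"<{name}>{escape(match.group())}</{name}>")
        pos = match.end()
    parts.append(escape(text[pos:]))  # trailing unmatched text
    parts.append(f"</{root}>")
    return "".join(parts)


if __name__ == "__main__":
    print(to_xml("2003-01-13 16:02:31 ERROR disk <full> 42"))
```

A different source would only need a different rule table, for example one selected by URL, while the scanning code is reused unchanged.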