cocoon-dev mailing list archives

From Stephan Michels <step...@apache.org>
Subject Re: cocoon and non XML content ... (was Jackson... five?)
Date Mon, 13 Jan 2003 15:02:31 GMT


On Mon, 13 Jan 2003, SAXESS - Hussayn Dabbous wrote:

> Oh, sorry for my question...
>
> Is the TextParser generator possibly what I am looking for?
> Could this parser also handle "unstructured text" as follows:
>
> "Take a piece of data out of the input, replace it with
> something else, and finally turn the whole thing into valid
> XML output..."
>
>
> regards,
> hussayn
>
> SAXESS - Hussayn Dabbous wrote:
> > Hi,
> >
> > I have been struggling with the following problem and wonder whether
> > it is relevant and has already been solved within Cocoon:
> >
> > Assume you have some content that is plain text, e.g. log reports.
> > Now you want to use this text with Cocoon. Naturally, you have to
> > convert the text to XML. This could of course be done by writing a new
> > generator, which would be specific to the data it has
> > to convert.
> >
> > Now assume you have many different sources that have to be
> > transformed into XML.
> >
> > Wouldn't it be nice to have a generator at hand that could be
> > controlled via configuration? That way I could use one generator,
> > configure the conversion rules as needed, get the XML data
> > out of it, and then proceed within Cocoon pipelines ...
> >
> >
> > One possible use case (it sounds like a JTidy task, but it isn't):
> >
> > I have several servers that produce very dirty HTML, intermixed with
> > JavaScript. My generator should gather data from these sites and
> > not only convert HTML to XHTML, but also make some necessary modifications
> > within the JavaScript, which is certainly not a suitable task for XSLT
> > processing, nor for JTidy. I could think of regexp processing here...
> >
> > Rather than creating dedicated generators for every site, I want one
> > generator that can be configured to convert data depending on the
> > URL, or whatever... I think this is just another step towards
> > real content syndication ...
> >
> > What do you think?
> > Any thoughts are welcome ...

The next version of the Chaperon components will include a text generator,
which will likely be designed as an XMLizer.
That version will also include a lexical scanner, which uses patterns
similar to regular expressions to tokenize the text, for cases where your
text is not structured. This LexicalTransformer can be used, for example,
for syntax highlighting.
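
To make the idea of a pattern-driven lexical scanner concrete, here is a
minimal sketch in Python. It is not Chaperon's implementation or its lexicon
format: the rule table RULES, the token names, and the functions tokenize()
and to_xml() are all hypothetical and chosen only for illustration. It shows
how a configurable list of regex-like patterns can turn unstructured text
into well-formed XML.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical rule table (token name, pattern). Order matters: the first
# alternative that matches at the current position wins.
# This is an illustration only, not Chaperon's actual lexicon syntax.
RULES = [
    ("timestamp", r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"),
    ("level", r"\b(?:INFO|WARN|ERROR)\b"),
    ("word", r"[A-Za-z_]\w*"),
    ("number", r"\d+"),
    ("space", r"\s+"),
    ("other", r"."),  # catch-all so every character is consumed
]

# Combine all rules into one alternation of named groups.
SCANNER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in RULES))


def tokenize(text):
    """Yield (token_name, matched_text) pairs covering the whole input."""
    for match in SCANNER.finditer(text):
        yield match.lastgroup, match.group()


def to_xml(text, root="text"):
    """Wrap recognized tokens in elements; pass other text through escaped."""
    parts = [f"<{root}>"]
    for name, value in tokenize(text):
        if name in ("space", "other"):
            parts.append(escape(value))
        else:
            parts.append(f"<{name}>{escape(value)}</{name}>")
    parts.append(f"</{root}>")
    return "".join(parts)


if __name__ == "__main__":
    print(to_xml("2003-01-13 15:02:31 ERROR disk <full> 42"))
```

Because the rules live in a plain data table, the same scanner could be
pointed at a different text source by swapping in a different table, which
is the kind of configuration-driven conversion asked about above.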

This version will be finished in the next few days, so stay tuned.

Stephan Michels.


