creadur-dev mailing list archives

From Robert Burrell Donkin
Subject [RAT] Pipelines...
Date Mon, 05 Aug 2013 14:11:09 GMT
Essentially, Rat is simple.

A source (perhaps a file system or a compressed archive) is walked, 
producing documents. Each document (perhaps a file in a file system, or 
a resource in an archive) flows through a pipeline - a series of 
processing steps, each enriching it with various meta-data. An end point 
collates the data.
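
As a rough illustration of that flow (walk a source, push each
document through the steps in order, collate at the end), here is a
minimal Java sketch. It is not taken from the Rat code base: FlowSketch,
Doc, run and the two example steps are all hypothetical names, and an
in-memory list stands in for the file system or archive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical names throughout; none of these types come from Rat.
final class FlowSketch {

    // A document plus the meta-data that the steps attach to it.
    static final class Doc {
        final String name;
        final String content;
        final Map<String, String> metaData = new LinkedHashMap<>();

        Doc(String name, String content) {
            this.name = name;
            this.content = content;
        }
    }

    // Walk the source, apply every step in order, collate the results.
    static List<Doc> run(Iterable<Doc> source, List<UnaryOperator<Doc>> steps) {
        List<Doc> collated = new ArrayList<>();
        for (Doc doc : source) {
            for (UnaryOperator<Doc> step : steps) {
                doc = step.apply(doc);
            }
            collated.add(doc);
        }
        return collated;
    }

    public static void main(String[] args) {
        // Stand-in for walking a file system or an archive.
        List<Doc> source = Arrays.asList(
                new Doc("A.java", "Licensed to the Apache Software Foundation"),
                new Doc("notes.txt", "no header here"));

        // Two illustrative steps, each adding one piece of meta-data.
        List<UnaryOperator<Doc>> steps = Arrays.asList(
                doc -> {
                    doc.metaData.put("type", doc.name.endsWith(".java") ? "source" : "other");
                    return doc;
                },
                doc -> {
                    boolean asf = doc.content.contains("Apache Software Foundation");
                    doc.metaData.put("license", asf ? "ASF" : "unknown");
                    return doc;
                });

        for (Doc doc : run(source, steps)) {
            System.out.println(doc.name + " " + doc.metaData);
        }
    }
}
```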

It seems to me that the current code fails to express this.


At the moment, IDocumentAnalyser[1] is implemented by most steps in the 
pipeline (and other stuff too), wired together in a potentially flexible 
fashion. This now seems over-engineered to me.

I think a concrete Pipeline would be more obvious, with controlled 
extension points at each step of the processing.
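
To make the idea of a concrete pipeline with controlled extension
points a little more tangible, here is one possible shape in Java. This
is only a sketch under assumptions: the choice of steps (type guessing,
license matching, reporting) and every name in it (TypeGuesser,
LicenseMatcher, Reporter, Pipeline) are hypothetical and do not reflect
Rat's actual interfaces.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; these names are illustrative, not Rat's API.
final class ConcretePipelineSketch {

    // Extension point for step one: decide what kind of document this is.
    interface TypeGuesser {
        String guessType(String name);
    }

    // Extension point for step two: decide which license the content carries.
    interface LicenseMatcher {
        String matchLicense(String content);
    }

    // The end point: receives one enriched record per document.
    interface Reporter {
        void report(String name, String type, String license);
    }

    // The pipeline is concrete: the order of steps is fixed in process(),
    // and callers can only vary the behaviour of each individual step.
    static final class Pipeline {
        private final TypeGuesser typeGuesser;
        private final LicenseMatcher licenseMatcher;
        private final Reporter reporter;

        Pipeline(TypeGuesser typeGuesser, LicenseMatcher licenseMatcher, Reporter reporter) {
            this.typeGuesser = typeGuesser;
            this.licenseMatcher = licenseMatcher;
            this.reporter = reporter;
        }

        void process(String name, String content) {
            String type = typeGuesser.guessType(name);
            String license = licenseMatcher.matchLicense(content);
            reporter.report(name, type, license);
        }
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        Pipeline pipeline = new Pipeline(
                name -> name.endsWith(".java") ? "source" : "other",
                content -> content.contains("Apache License") ? "AL2" : "unknown",
                (name, type, license) -> lines.add(name + ": " + type + ", " + license));

        pipeline.process("A.java", "Licensed under the Apache License");
        lines.forEach(System.out::println);
    }
}
```

The sequence of steps is readable in one place (process), while each
step can still be swapped without rewiring the whole chain.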


