xmlgraphics-fop-dev mailing list archives

From thomas.dewe...@kodak.com
Subject Re: Unicode compliant Line Breaking
Date Tue, 01 Nov 2005 11:27:41 GMT
Hi Manuel,

Manuel Mall <mm@arcus.com.au> wrote on 11/01/2005 04:24:05 AM:

> On Tue, 1 Nov 2005 01:33 am, thomas.deweese@kodak.com wrote:
> >         Just an FYI, Batik also currently has an implementation of
> > the Unicode TR14 line breaking algorithm
> > (org.apache.batik.gvt.flow.TextLineBreak).

> Thomas, thanks for the pointer (Note to myself - need to become more 
> aware of what's in the Batik code base. Feeble excuse - Joerg didn't 
> seem to know either).

    It's a fairly recent addition, to support proposals for flowing 
text in SVG 1.2.

> Had a look at the Batik code: Same algorithm as Joerg wrote (not 
> surprising as UAX#14 actually contains real C code), with very similar 
> data structures internally. The data structures are hard coded and not 
> generated from the Unicode text files. 

   I don't think it would be worthwhile to parse the Unicode
files on every startup (they aren't small).  Passing in the table
that maps chars to types might be a useful extension (but honestly
I doubt .5% of users would ever provide their own, unless the code
only included, say, Western languages by default).
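
   To make the "pass in the table" idea concrete, here is a minimal
sketch.  It is not Batik's or FOP's API: the class, the BreakClass
enum and the classify method are hypothetical names, and the default
table covers only a handful of ASCII characters.  The class names
(AL, NU, SP, HY, ID, XX) are the ones UAX#14 uses.

```java
import java.util.function.IntFunction;

// Hypothetical sketch: a hard-coded default table plus an optional
// caller-supplied table mapping code points to line break classes.
class PluggableBreakTableSketch {

    // A small subset of the UAX#14 line break classes.
    enum BreakClass { AL, NU, SP, HY, ID, XX }

    // Hard-coded default covering only basic ASCII; everything else is XX (unknown).
    static final IntFunction<BreakClass> DEFAULT_TABLE = cp -> {
        if (cp == ' ') return BreakClass.SP;
        if (cp == '-') return BreakClass.HY;
        if (cp >= '0' && cp <= '9') return BreakClass.NU;
        if ((cp >= 'a' && cp <= 'z') || (cp >= 'A' && cp <= 'Z')) return BreakClass.AL;
        return BreakClass.XX;
    };

    // Classify each code point of the text using whichever table the caller passes in.
    static BreakClass[] classify(String text, IntFunction<BreakClass> table) {
        return text.codePoints()
                   .mapToObj(table::apply)
                   .toArray(BreakClass[]::new);
    }

    // Example of a user-provided table: override one code point, else fall back.
    static IntFunction<BreakClass> withHiraganaA() {
        return cp -> cp == 0x3042 ? BreakClass.ID : DEFAULT_TABLE.apply(cp);
    }
}
```

   A caller who is happy with the built-in data passes DEFAULT_TABLE;
one who needs more coverage wraps it, as withHiraganaA() does, so
nothing has to be parsed at startup.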

> The API is different, especially it relies 
> on Batik specific types being passed across not just plain Strings (but 
> this could probably be handled by a wrapper).

   AttributedString (the type passed across the interface) is a
JDK class: java.text.AttributedString.  We do define new attributes
(keys) to hang the word break info off of.
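
   For anyone unfamiliar with the mechanism, a rough sketch of
hanging break info off an AttributedString with a custom key follows.
The key name BREAK_ALLOWED and the "break after a space" rule are
made up for illustration; they are not Batik's actual keys and not
the TR14 algorithm.  Only the java.text classes are real.

```java
import java.text.AttributedCharacterIterator;
import java.text.AttributedString;
import java.text.CharacterIterator;
import java.util.ArrayList;
import java.util.List;

class WordBreakAttributeSketch {

    // Hypothetical attribute key; custom keys subclass the JDK's Attribute class.
    static final class BreakAttribute extends AttributedCharacterIterator.Attribute {
        static final BreakAttribute BREAK_ALLOWED = new BreakAttribute("BREAK_ALLOWED");
        private BreakAttribute(String name) { super(name); }
    }

    // Mark every character that follows a space as a permitted break position.
    // This is a toy rule standing in for a real line breaking algorithm.
    static AttributedString annotate(String text) {
        AttributedString as = new AttributedString(text);
        for (int i = 0; i < text.length() - 1; i++) {
            if (text.charAt(i) == ' ' && text.charAt(i + 1) != ' ') {
                as.addAttribute(BreakAttribute.BREAK_ALLOWED, Boolean.TRUE, i + 1, i + 2);
            }
        }
        return as;
    }

    // Read the break info back through the standard iterator interface.
    static List<Integer> breakOffsets(AttributedString as) {
        List<Integer> offsets = new ArrayList<>();
        AttributedCharacterIterator it = as.getIterator();
        for (char c = it.first(); c != CharacterIterator.DONE; c = it.next()) {
            if (Boolean.TRUE.equals(it.getAttribute(BreakAttribute.BREAK_ALLOWED))) {
                offsets.add(it.getIndex());
            }
        }
        return offsets;
    }
}
```

   The point is that a consumer only needs the JDK types plus the key
constants, so a thin wrapper taking a plain String is easy to add.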

> This probably strengthens the argument of making all of this part of 
> XMLGraphics Common....grumble...grumble...

   Yes, this is mostly why I mentioned it.  On the other hand the
code is not that large or really overly complex.

> My main reason for hesitation with the XMLGraphics Common approach is 
> simple man power. We need to setup the infrastructure (subversion, 
> mailing lists, web site, etc.). We need to maintain this. 

   Sure, some of this will happen anyway because of the current 
problems we have with the PDFTranscoder (Batik depends on FOP which
depends on Batik :( ).  Those dependencies need to be straightened
out.

> We would basically publish APIs currently internal to Batik 
> and FOP with all the resultant support headaches. For example, 
> I would not like to see my time diluted at the moment by having 
> to discuss API needs outside of FOP/Batik. 

   Yes, this is the big issue: as soon as an API becomes public
it is a lot more work to maintain it. 

> Actually I am reluctant to even dive into the Batik code base 
> at the moment. FOP is complicated enough to digest.

   The hope is that by exposing some of these APIs we will 
attract some people as contributors who would otherwise be 
'scared off' by the size and complexity of the FOP and
Batik code bases.

