xmlgraphics-fop-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Xmlgraphics-fop Wiki] Trivial Update of "GoogleSummerOfCode2006/FloatsImplementationProgress/ImplementingBeforeFloats" by VincentHennebert
Date Fri, 04 Aug 2006 10:46:02 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Xmlgraphics-fop Wiki" for change notification.

The following page has been changed by VincentHennebert:
http://wiki.apache.org/xmlgraphics-fop/GoogleSummerOfCode2006/FloatsImplementationProgress/ImplementingBeforeFloats

The comment on the change is:
Splitting: 2- Implementing Before-floats

New page:
#pragma section-numbers on

'''Contents'''
[[TableOfContents]]


== Characteristics of the fo:float element ==
This section contains a summary of the part of the spec dealing with floats.

||'''Number of generated areas'''||'''Area class'''||'''Notes'''||
||<|2>0 or 1 ||<|2(>xsl-anchor||inline area of dimension 0 if possible, block area otherwise||
||only if the value of the "float" property is not "none"||
||<|6> 1 or more of||<|4>xsl-before-float||must be a descendant of a flow object assigned to a region-body||
||may not be a descendant of an absolutely-positioned block-container||
||must appear on the same or a following page||
||may be broken on several pages only if it can't fit on a page alone (without any other float, footnote, or normal content)||
||xsl-side-float||generates reference areas||
||xsl-normal|| ||

 * Validity checks: an fo:float may not have an fo:float, fo:footnote or fo:marker as a descendant.
There are several objects which have such constraints (fo:title, fo:footnote...) but AFAICT
the checks for those constraints are not implemented. I'll leave it as is for now, as it is
not critical. A general solution will have to be found when implementing such checks.

== Factorizing out the Handling of Footnotes and Floats ==
I see only two differences between before-floats and footnotes:
 * footnotes must appear on the same page as their citation, unless there really is no possible
pagination which achieves that. Before-floats (typically figures) may appear on later pages.
 * footnotes may be split so that a part is placed on the following page. Before-floats may not
be split.

Apart from those two differences, the handling is the same. So layoutmgr.!PageBreakingAlgorithm
could be adapted to handle not a single list of footnotes but two (or more) lists of floats;
the float machinery could be extracted from !PageBreakingAlgorithm and put in a separate parameterized
class. In fact the two parameters could simply be the penalties for deferring and splitting:
 ||'''Kind of float'''||'''Defer penalty'''||'''Split penalty'''||
 ||footnote||almost infinite||very high||
 ||before-float||high||infinite||
(Actually a before-float may be split, but only in the degenerate case where it does not
fit alone on a whole page.)

Another possibility: a single parameter, the defer penalty, plus an overridden getFloatSplit method,
which would contain the code of the current getFootnoteSplit method for footnotes, and just
return 0 for before-floats.
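
A minimal Java sketch of this second possibility follows. Only getFloatSplit and getFootnoteSplit
are names used above; the class names, the numeric penalty values and the simplified "take what
fits" splitting logic are assumptions made for illustration, not code from FOP.

```java
/** Hypothetical base class: one defer penalty plus an overridable split method. */
abstract class OutOfLineHandler {
    /** Stand-in for an "infinite" penalty; the value itself is an assumption. */
    static final int INFINITE = 1000;

    private final int deferPenalty;

    OutOfLineHandler(int deferPenalty) {
        this.deferPenalty = deferPenalty;
    }

    /** Penalty applied when the object is deferred to a later page. */
    int getDeferPenalty() {
        return deferPenalty;
    }

    /**
     * Returns the length of the part that may be placed in the available
     * space, or 0 if the object must not be split here.
     */
    abstract int getFloatSplit(int objectLength, int availableLength);
}

/** Footnotes: deferring is almost forbidden, splitting is possible. */
class FootnoteHandler extends OutOfLineHandler {
    FootnoteHandler() {
        super(INFINITE - 1); // "almost infinite"
    }

    @Override
    int getFloatSplit(int objectLength, int availableLength) {
        // Placeholder for the logic of the current getFootnoteSplit method:
        // here we simply take as much as fits.
        return Math.max(0, Math.min(objectLength, availableLength));
    }
}

/** Before-floats: deferring is merely discouraged, splitting never happens. */
class BeforeFloatHandler extends OutOfLineHandler {
    BeforeFloatHandler() {
        super(100); // high but finite
    }

    @Override
    int getFloatSplit(int objectLength, int availableLength) {
        return 0; // a before-float is never split in this sketch
    }
}
```

With this shape the breaking algorithm would only talk to the base class, and the degenerate
case of a float too big for a whole page would need separate handling.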

=== Changes on the LayoutManager Architecture ===
When the "float" property is "none", the float must be handled as a normal block; no anchor
area is generated. To handle this case I've chosen to directly create a !FloatBodyLayoutManager
which will render the float in the flow of elements. Otherwise I mimic the behaviour of footnotes:
a !FloatLayoutManager is created which will insert an anchor in the list of Knuth elements;
the corresponding float blocks will be handled by !FloatBodyLayoutManager. This is done in
!LayoutManagerMapping.!FloatLayoutManagerMaker, where the value of the "float" property for
the corresponding Float node is consulted before creating the !LayoutManager.
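
The decision made in the maker can be sketched as follows. This is a self-contained illustration:
the types below (FloatValue, FloatNode, SketchLayoutManager, FloatBodyLM, FloatAnchorLM,
FloatLMMaker) are simplified stand-ins invented for the example and do not reproduce the
signatures of the real FOP classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for the values of the "float" property (not FOP's own constants). */
enum FloatValue { NONE, BEFORE, START, END }

/** Stand-in for the fo:float node; only carries the property value. */
class FloatNode {
    final FloatValue floatValue;

    FloatNode(FloatValue floatValue) {
        this.floatValue = floatValue;
    }
}

/** Stand-in for the LayoutManager interface. */
interface SketchLayoutManager { }

/** Lays out the content of the float; used directly when float="none". */
class FloatBodyLM implements SketchLayoutManager {
    final FloatNode node;

    FloatBodyLM(FloatNode node) {
        this.node = node;
    }
}

/** Inserts an anchor in the list of Knuth elements, as is done for footnotes. */
class FloatAnchorLM implements SketchLayoutManager {
    final FloatNode node;

    FloatAnchorLM(FloatNode node) {
        this.node = node;
    }
}

/** Chooses the layout manager by consulting the "float" property of the node. */
class FloatLMMaker {
    void make(FloatNode node, List<SketchLayoutManager> lms) {
        if (node.floatValue == FloatValue.NONE) {
            // Handled as a normal block: no anchor area is generated.
            lms.add(new FloatBodyLM(node));
        } else {
            // Out-of-line float: an anchor stays in the flow of elements.
            lms.add(new FloatAnchorLM(node));
        }
    }
}
```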

There are probably things to factor out between the two !LayoutManagers; the {{{addAreas}}}
method, for example, is the same in both. It may make sense to create an abstract !OutOfLineLayoutManager
super-class. However, the {{{addAreas}}} method never seems to be called, so it could perhaps
be removed, in which case a common super-class would be useless. I still have to
find this out; it is on my TODO list.

=== A Special Class for out-of-line Objects ===
First, there are many classes in the layoutmgr package which are related to the Knuth breaking
algorithm. As the layoutmgr package already contains a lot of classes, it may make sense to
create a new subpackage for the breaking algorithm. That's what I did in the patch, and if
this is agreed I'll move the other classes into this subpackage in my next patch.

The !OutOfLineRecord class is meant to contain all the logic related to the handling of out-of-line
objects:
 * storing progress information during the breaking process: how many out-of-line objects have already
been encountered, how many have already been placed, whether the last placed object was split, etc.
The corresponding variables play exactly the same role as the totalWidth, totalStretch and totalShrink
variables.
 * methods to manipulate out-of-line objects: register newly encountered ones, find a place
where to split, etc.

The progress information is stored in an internal class of !OutOfLineRecord; it is used
for two things:
 * to record the current situation during the breaking, when a legal breakpoint is being considered;
 * when an active node is created, to record information about the out-of-line objects inserted
up to the corresponding (feasible) breakpoint.

Why an internal class?
 * as the progress information is also used by active nodes, it is better to group it
in one class rather than having several independent fields. Hence a class.
 * it is only one part of the information stored in an !OutOfLineRecord instance. The rest
is the list of Knuth sequences corresponding to out-of-line objects, the list
of cumulated lengths, the size of the separator, and so on. Hence an internal class, part
1.
 * it is accessed very often by methods of !OutOfLineRecord. This is a convenient way to
have access to the fields, while keeping them private for other external classes. Hence an
internal class, part 2.
 * as already said, it is also used by active nodes, and not only by an !OutOfLineRecord
instance. Hence a static class.
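
The structure argued for above might look like the following sketch. The field and method names
(ProgressInfo, register, placeNext, snapshot, ...) are hypothetical and the logic is reduced to
simple counters; the point is only to show a static nested class whose private fields are
reachable from the enclosing class and whose instances can also be stored by active nodes.

```java
import java.util.ArrayList;
import java.util.List;

class OutOfLineRecordSketch {

    /**
     * Progress information. Static, so that an instance can live on its own
     * (for example inside an active node) without an enclosing record.
     */
    static class ProgressInfo {
        private int encountered;   // out-of-line objects encountered so far
        private int placed;        // out-of-line objects already placed
        private boolean lastSplit; // was the last placed object split?

        /** Returns an independent copy, suitable for storing in an active node. */
        ProgressInfo copy() {
            ProgressInfo c = new ProgressInfo();
            c.encountered = encountered;
            c.placed = placed;
            c.lastSplit = lastSplit;
            return c;
        }

        int getEncountered() { return encountered; }
        int getPlaced() { return placed; }
        boolean isLastSplit() { return lastSplit; }
    }

    // The other information held by the record (only one list shown here).
    private final List<Integer> cumulatedLengths = new ArrayList<>();
    private final ProgressInfo current = new ProgressInfo();

    /** Registers a newly encountered out-of-line object of the given length. */
    void register(int length) {
        int previous = cumulatedLengths.isEmpty()
                ? 0
                : cumulatedLengths.get(cumulatedLengths.size() - 1);
        cumulatedLengths.add(previous + length);
        current.encountered++; // private field, accessible from the enclosing class
    }

    /** Marks the next pending object as placed, recording whether it was split. */
    void placeNext(boolean split) {
        if (current.placed < current.encountered) {
            current.placed++;
            current.lastSplit = split;
        }
    }

    /** Number of objects encountered but not yet placed. */
    int deferredCount() {
        return current.encountered - current.placed;
    }

    /** Snapshot of the current situation, taken when an active node is created. */
    ProgressInfo snapshot() {
        return current.copy();
    }
}
```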

=== Other Changes ===
These mostly consist of copying the code relating to footnotes, wherever footnotes are referred
to, and adapting it to floats. Examples: adding anchors for before-floats in !KnuthBlockBox,
adding a !FloatLayoutManagerMaker in !LayoutManagerMapping, handling the addition of a before-float
area when necessary, etc.

== Algorithm for Placing Before-Floats ==
In FOP, out-of-line objects are handled by an extension of the Knuth breaking algorithm. The
handling of before-floats is a bit simpler because they can't be split across several pages like
footnotes (except in the degenerate case where a float does not fit on one page alone).

Ideally, a footnote should be entirely placed on the same page as its citation. When this
is not possible, it may be split, but as few times as possible. See the following figure to
understand the issue (the line with the small red sign contains the footnote citation):

http://atvaark.dyndns.org/~vincent/footnotes.png

In the first case, the footnote is split over two pages and that's the best we can do. In the
second case, pieces of the footnote appear up to 3 pages later; this would disturb the
reader, who would have to turn too many pages to read the footnote.

To avoid that, the algorithm prevents a footnote from being split if there is a legal breakpoint
between the currently considered active node and the currently considered breakpoint, unless
this is a new footnote (i.e., not already split). For example, in the preceding figure, every
line corresponds to a legal breakpoint. When the line containing the footnote citation is
considered for breaking the page, the new footnote may be split. When the following line is
considered, there are already many legal breakpoints between the breakpoint of the previous
page and that one, so the footnote is not allowed to be split. So the algorithm tries to put
the entire footnote on the page, which does not work as it is too big. Thus the breakpoint
is discarded (this is not a ''feasible'' breakpoint), and the same goes for the following lines.
For the first page, the best breakpoint then corresponds to the line with the footnote citation,
which allows as much of the footnote as possible to be put on this page.

On the second page, no break will be permitted if it splits the footnote, for the same reason
as before. Thus the best breakpoint will be the one which puts as many normal lines as possible
on the page, plus the entire remaining piece of footnote.
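
The rule described above boils down to a small predicate, sketched here in Java. The class and
parameter names are hypothetical. "New" is read as in the example, i.e. the footnote's citation
is on the line just before the candidate breakpoint; this reading is an assumption, and the real
algorithm may take further conditions into account.

```java
/** Hypothetical helper expressing when a footnote may be split at a breakpoint. */
class FootnoteSplitRule {

    /**
     * @param isNewFootnote      true if the footnote is first encountered at the
     *                           candidate breakpoint (assumed meaning of "new")
     * @param legalBreaksBetween number of legal breakpoints strictly between the
     *                           active node and the candidate breakpoint
     * @return true if the footnote is allowed to be split at this breakpoint
     */
    static boolean maySplit(boolean isNewFootnote, int legalBreaksBetween) {
        // A new footnote may always be split; any other footnote only when
        // no earlier legal breakpoint was available since the active node.
        return isNewFootnote || legalBreaksBetween == 0;
    }
}
```

In the figure, the line carrying the citation gives maySplit(true, n), which is true, while each
following line gives maySplit(false, n) with n > 0, which is false; those breakpoints are then
feasible only if the whole footnote fits.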

As before-floats may not be split, their handling is simpler than for footnotes. Actually
we may use the same algorithm, but this will force the float to be on the same page as its
citation, which may give underfull pages as in the following figure:

http://atvaark.dyndns.org/~vincent/floats-underfull.png

It would be better to put the citation on the first page together with some other lines and
defer the float to the second page.

In fact, just playing with increased demerits for breakpoints with deferred floats is sufficient
to have a reasonable amount of floats on the same page as their citations, while preventing
underfull pages from being created.
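
As an illustration of the idea, here is a sketch of demerits increased per deferred float. The
constant, the linear formula and all names are assumptions chosen for the example; they are not
the values or the formula used in the patch.

```java
/** Hypothetical demerits computation that penalizes deferred before-floats. */
class DeferredFloatDemerits {

    /** Assumed additional demerits for each deferred float (illustrative value). */
    static final double DEFER_DEMERITS = 10000.0;

    /**
     * @param baseDemerits   demerits of the page without considering floats
     *                       (large for an underfull page)
     * @param deferredFloats number of floats cited so far but not yet placed
     * @return the demerits of the breakpoint, increased for deferred floats
     */
    static double demerits(double baseDemerits, int deferredFloats) {
        return baseDemerits + DEFER_DEMERITS * deferredFloats;
    }
}
```

With these made-up numbers, a well-filled page that defers one float (100 + 10000) beats a badly
underfull page that keeps it (50000), whereas between two similarly filled pages (200 against
100 + 10000) the one keeping the float with its citation wins.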
