xmlgraphics-fop-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Xmlgraphics-fop Wiki] Update of "GoogleSummerOfCode2006/FloatsImplementationProgress/ImplementingBeforeFloats" by VincentHennebert
Date Mon, 14 Aug 2006 12:20:31 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Xmlgraphics-fop Wiki" for change notification.

The following page has been changed by VincentHennebert:
http://wiki.apache.org/xmlgraphics-fop/GoogleSummerOfCode2006/FloatsImplementationProgress/ImplementingBeforeFloats

The comment on the change is:
Proposing an improved algorithm for placing out-of-line objects

------------------------------------------------------------------------------
  
  In fact, just playing with increased demerits for breakpoints with deferred floats is sufficient
to have a reasonable amount of floats on the same page as their citations, while preventing
underfull pages from being created.
  
+ = Improving the Algorithm =
+ == Rationale ==
+ ''“In the Pagination World, if the worst can happen, it will happen.”'' -- vh
+ 
+ The existing algorithm for footnotes provided a strong and helpful basis for the implementation
of before-floats. However, before-floats have specific characteristics of which we may take
further advantage; among others:
+  * they are likely to have more shrinkability/stretchability than footnotes;
+  * it isn't critical to defer them by one page;
+  * it's better to have a full page with a deferred float than an underfull page with
no deferred float.
+ 
+ The following three examples show one small limitation of the footnote handling,
along with limitations regarding before-floats. The algorithm proposed below should be able
to deal with those limitations.
+ 
+ '''Footnote'''
+ 
+ As explained above, a line is not considered to be a feasible breakpoint if there is a previously
encountered footnote which must be split. This makes it possible to put as much of the footnote
as possible on the current page, except in the following case:
+ 
+ ||http://atvaark.dyndns.org/~vincent/footnote_no-break-between_ok-1.png||http://atvaark.dyndns.org/~vincent/footnote_no-break-between_ok-2.png||
+ 
+ Here the block in the footnote containing the text "Unbreakable block" is actually breakable.
If we now add the property keep-together="always" to it, the result becomes:
+ 
+ ||http://atvaark.dyndns.org/~vincent/footnote_no-break-between_ko-1.png||http://atvaark.dyndns.org/~vincent/footnote_no-break-between_ko-2.png||http://atvaark.dyndns.org/~vincent/footnote_no-break-between_ko-3.png||
+ 
+ What's happening is the following: if the line containing the footnote citation is placed
on the first page, there is no acceptable way to place the footnote: putting just
the first three lines before the unbreakable text leads to an underfull page; putting the
whole unbreakable block leads to an overfull page. The former possibility could be a solution:
a too-short node would be created to represent this page break. But the too-short node that is
actually chosen is preferred because it contains no footnote split, so its demerits are lower.
+ 
+ The solution would be to split the footnote before the unbreakable block, and to put on
the first page one or two more lines of normal text after the line containing the footnote
citation. But as already said, the algorithm doesn't allow this possibility.
+ 
+ 
+ '''Before-float 1'''
+ 
+ Let's consider the following FO document: {{{
+ <fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
+   <fo:layout-master-set>
+     <fo:simple-page-master master-name="default"
+       page-width="12cm" page-height="5.3cm">
+       <fo:region-body/>
+     </fo:simple-page-master>
+   </fo:layout-master-set>
+   <fo:page-sequence master-reference="default">
+     <fo:flow flow-name="xsl-region-body"
+ 	     space-after.minimum="2pt"
+ 	     space-after.optimum="6pt"
+ 	     space-after.maximum="14pt"
+ 	     widows="1" orphans="1">
+       <fo:block space-after="inherit">
+ 	This is a block with a float. This is a block with a float.
+ 	The float anchor is just behind this <fo:inline color="blue">word</fo:inline><fo:float
float="before" color="blue">
+ 	  <fo:block>
+ 	    This is the float content. This is the float content.
+ 	    This is the float content. This is the float content.
+ 	    This is the float content. This is the float content.
+ 	    This is the float content.
+ 	  </fo:block>
+ 	</fo:float>.
+ 	This is a block with a float. This is a block with a float.
+ 	This is a block with a float. This is a block with a float.
+       </fo:block>
+       <fo:block space-after="inherit">
+ 	This is a block without a float. This is a block without a float.
+ 	This is a block without a float. This is a block without a float.
+ 	This is a block without a float. This is a block without a float.
+       </fo:block>
+       <fo:block space-after="inherit">
+ 	This is a block without a float. This is a block without a float.
+ 	This is a block without a float. This is a block without a float.
+ 	This is a block without a float. This is a block without a float.
+       </fo:block>
+       <fo:block space-after="inherit">
+ 	This is a block without a float. This is a block without a float.
+ 	This is a block without a float. This is a block without a float.
+ 	This is a block without a float. This is a block without a float.
+       </fo:block>
+     </fo:flow>
+   </fo:page-sequence>
+ </fo:root>
+ }}}
+ 
+ The rendering is the following:
+ 
+ ||http://atvaark.dyndns.org/~vincent/before-float_main-should-shrink_ok-1.png||http://atvaark.dyndns.org/~vincent/before-float_main-should-shrink_ok-2.png||
+ 
+ Now if we set the space-after.optimum value to "8pt", the result becomes:
+ 
+ ||http://atvaark.dyndns.org/~vincent/before-float_main-should-shrink_ko-1.png||http://atvaark.dyndns.org/~vincent/before-float_main-should-shrink_ko-2.png||
+ 
+ What's going on? When a before-float is being considered, the algorithm measures the main text
already placed on the page at its ''optimum'' length, not at its minimum length. If there is not
enough room for the float, the algorithm doesn't even try to shrink the normal content. Thus the
float is deferred to the following page, which may well puzzle a user who can plainly see
that there would be room for the float on the first page.
+ 
+ 
+ '''Before-float 2'''
+ 
+ The problem is the same if the float is allowed to shrink but not the content: {{{
+ <fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
+   <fo:layout-master-set>
+     <fo:simple-page-master master-name="default"
+       page-width="12cm" page-height="5.25cm">
+       <fo:region-body/>
+     </fo:simple-page-master>
+   </fo:layout-master-set>
+   <fo:page-sequence master-reference="default">
+     <fo:flow flow-name="xsl-region-body" widows="1" orphans="1">
+       <fo:block>
+ 	This is a block without a float. This is a block without a float.
+ 	This is a block without a float. This is a block without a float.
+ 	This is a block without a float. This is a block without a float.
+ 	This is a block without a float. This is a block without a float.
+       </fo:block>
+       <fo:block>
+ 	This is a block with a float. This is a block with a float.
+ 	The float anchor is just behind this <fo:inline color="blue">word</fo:inline><fo:float
float="before" color="blue">
+ 	  <fo:block space-after.minimum="3pt"
+ 		    space-after.optimum="4pt"
+ 		    space-after.maximum="9pt">
+ 	    This is the float content. This is the float content.
+ 	    This is the float content. This is the float content.
+ 	  </fo:block>
+ 	  <fo:block>
+ 	    This is the float content. This is the float content.
+ 	    This is the float content. This is the float content.
+ 	  </fo:block>
+ 	</fo:float>.
+ 	This is a block with a float. This is a block with a float.
+ 	This is a block with a float.
+       </fo:block>
+       <fo:block>
+ 	This is a block without a float. This is a block without a float.
+ 	This is a block without a float. This is a block without a float.
+ 	This is a block without a float. This is a block without a float.
+ 	This is a block without a float. This is a block without a float.
+ 	This is a block without a float. This is a block without a float.
+ 	This is a block without a float. This is a block without a float.
+ 	This is a block without a float. This is a block without a float.
+ 	This is a block without a float. This is a block without a float.
+ 	This is a block without a float. This is a block without a float.
+       </fo:block>
+     </fo:flow>
+   </fo:page-sequence>
+ </fo:root>
+ }}}
+ 
+ The result is the following:
+ 
+ ||http://atvaark.dyndns.org/~vincent/before-float_float-should-shrink_ok-1.png||http://atvaark.dyndns.org/~vincent/before-float_float-should-shrink_ok-2.png||
+ 
+ If we set space-after.optimum to "6pt", this becomes:
+ 
+ ||http://atvaark.dyndns.org/~vincent/before-float_float-should-shrink_ko-1.png||http://atvaark.dyndns.org/~vincent/before-float_float-should-shrink_ko-2.png||http://atvaark.dyndns.org/~vincent/before-float_float-should-shrink_ko-3.png||
+ 
+ Here there is an additional interesting issue, which was already shown in part in the footnote
case: the first page break corresponds to a too-short node; but this node is still preferred
to a node containing additional lines, because among them would be the line containing the
reference to the deferred float, which results in higher demerits for that node, even though
the aesthetic result would actually be better.
+ 
+ 
+ == Proposed Algorithm ==
+ This algorithm is mainly derived from the algorithm presented in the paper "Pagination Reconsidered".
+ 
+ === Allowing Underfull Pages ===
+ At the paragraph level, underfull lines in justified text are unacceptable, as they create
a ragged margin which catches the eye. That's why too-short nodes are handled separately from
the other normal nodes and used only as a fallback.
+ 
+ But at the page-sequence level, where the "display-align" property has a default value of
"before", underfull pages may even go unnoticed. And pages are more likely to be underfull
than lines, as they may contain big paragraphs with no possible stretching.
+ 
+ That's why underfull pages should be acceptable to a certain degree, which ideally would
be configurable. Let's say 90%. The algorithm would then accept as feasible the breaks which
lead to a page that is more than 90% full. Thus in the third example above, the first page
would have a few more lines and would end at a feasible break, leading to a slightly
underfull page. Even with a deferred float, the fact that this is a feasible
break would make it preferable to the last registered too-short node which is currently chosen.
+ 
+ Of course we may take the amount of underfullness into account when computing a page's demerits,
so that full pages are given the preference.
+ 
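The acceptance rule described above could be sketched as follows. This is only an illustration under stated assumptions: the class, the method names, the 0.90 threshold constant and the quadratic penalty are all hypothetical and are not taken from FOP's code; heights are plain integers in any consistent unit.

```java
/** Illustrative sketch only; these names are hypothetical, not FOP classes. */
final class UnderfullPageCheck {

    /** Assumed configurable threshold: minimum fill ratio of a feasible page. */
    static final double MIN_FILL_RATIO = 0.90;

    /** Assumed weight that turns unused space into extra demerits. */
    static final double UNDERFULL_WEIGHT = 100.0;

    /** A break is feasible if the content fits and fills enough of the page. */
    static boolean isFeasible(int contentHeight, int pageHeight) {
        return contentHeight <= pageHeight
                && contentHeight >= MIN_FILL_RATIO * pageHeight;
    }

    /** Extra demerits grow with the unused share of the page (assumed quadratic). */
    static double underfullDemerits(int contentHeight, int pageHeight) {
        double unusedRatio = (pageHeight - contentHeight) / (double) pageHeight;
        return UNDERFULL_WEIGHT * unusedRatio * unusedRatio;
    }
}
```

With these assumptions a page filled to 95% is a feasible break while one filled to 85% is not, and a completely full page gets no extra demerits, so full pages keep the preference.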
+ === Allowing Out-of-line-only Pages ===
+ We could imagine that a document has so many figures that it is sometimes necessary to create
a page entirely made of figures, to "flush" the figure list. So the algorithm should be able
to handle such cases.
+ 
+ In section 6.10.1.3 of the Recommendation, there is the following sentence: "There may be
limits on how much space [...] areas [for out-of-line objects] can borrow from the region-reference-area.
It is left to the user agent to decide these limits." Ideally there would be a configuration
setting specifying what proportion of the page must be filled with normal content; if this ratio
is zero, then pages made only of out-of-line objects would be allowed.
+ 
+ How to handle that? Each time a normal active node is registered, we consider the currently
deferred out-of-line objects. If it is possible to make a feasible page, we register a new
active node for the corresponding feasible break.
+ 
+ This implies two nested {{{for}}} loops iterating over before-floats and footnotes,
creating an active node each time some combination of out-of-lines is possible. This
will consume some additional memory and processing time, but it could be disabled (by setting
a non-zero minimum ratio of normal content), and the feature would be dedicated to complex
book-like documents where pagination quality is more important than processing time.
+ 
+ Actually those loops may also be desirable when considering a "normal" page break, because,
for example, splitting a footnote twice might help solve an otherwise impossible pagination
problem in later pages.
+ 
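The two nested loops mentioned above might look like the sketch below. It is a simplified illustration, not FOP code: the class, the `Node` type and all parameter names are hypothetical, each out-of-line object is reduced to a fixed height, objects are assumed to be placed in their deferred order, and footnote splitting is ignored.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; these names are hypothetical, not FOP classes. */
final class OutOfLineOnlyPages {

    /** A candidate page holding the first floatCount floats and footnoteCount footnotes. */
    static final class Node {
        final int floatCount;
        final int footnoteCount;
        final int height;

        Node(int floatCount, int footnoteCount, int height) {
            this.floatCount = floatCount;
            this.footnoteCount = footnoteCount;
            this.height = height;
        }
    }

    static List<Node> candidateNodes(int[] floatHeights, int[] footnoteHeights,
                                     int pageHeight, double minFillRatio,
                                     double minNormalContentRatio) {
        List<Node> nodes = new ArrayList<>();
        if (minNormalContentRatio > 0) {
            return nodes; // feature disabled: every page needs some normal content
        }
        int floatsHeight = 0;
        for (int f = 0; f <= floatHeights.length; f++) {
            if (f > 0) {
                floatsHeight += floatHeights[f - 1];
            }
            if (floatsHeight > pageHeight) {
                break; // taking more floats can only overflow further
            }
            int total = floatsHeight;
            for (int n = 0; n <= footnoteHeights.length; n++) {
                if (n > 0) {
                    total += footnoteHeights[n - 1];
                }
                if (total > pageHeight) {
                    break; // taking more footnotes can only overflow further
                }
                // Register a node for every non-empty combination that fills enough of the page.
                if (f + n > 0 && total >= minFillRatio * pageHeight) {
                    nodes.add(new Node(f, n, total));
                }
            }
        }
        return nodes;
    }
}
```

The number of candidate nodes grows with the product of the two list lengths, which is the memory and time cost mentioned above; a non-zero minimum ratio of normal content switches the enumeration off entirely.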
+ === Taking Advantage of Out-of-line Shrinking/Stretching ===
+ The current algorithm does not consider the possibility of shrinking/stretching an out-of-line
object to make it fit on the current page. It wouldn't be too difficult to implement this
possibility.
+ 
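As a rough sketch of what taking shrinkability into account could mean, the check below compares a test at optimum lengths with one that allows both the main content and the out-of-line object to shrink. The class, the `Elastic` triple and the method names are hypothetical and do not correspond to FOP classes; the example lengths in the note that follows are invented for illustration.

```java
/** Illustrative sketch only; these names are hypothetical, not FOP classes. */
final class ElasticFit {

    /** A length with its minimum, optimum and maximum values. */
    static final class Elastic {
        final int min;
        final int opt;
        final int max;

        Elastic(int min, int opt, int max) {
            this.min = min;
            this.opt = opt;
            this.max = max;
        }
    }

    /** Test at optimum lengths only, with no shrinking attempted. */
    static boolean fitsAtOptimum(Elastic main, Elastic outOfLine, int pageHeight) {
        return main.opt + outOfLine.opt <= pageHeight;
    }

    /** Test that lets both the main content and the out-of-line object shrink. */
    static boolean fitsWithShrinking(Elastic main, Elastic outOfLine, int pageHeight) {
        return main.min + outOfLine.min <= pageHeight;
    }

    /** Share of the available shrink needed to fit; 0 if none, above 1 if it cannot fit. */
    static double shrinkRatio(Elastic main, Elastic outOfLine, int pageHeight) {
        int excess = main.opt + outOfLine.opt - pageHeight;
        if (excess <= 0) {
            return 0.0;
        }
        int availableShrink = (main.opt - main.min) + (outOfLine.opt - outOfLine.min);
        if (availableShrink == 0) {
            return Double.POSITIVE_INFINITY;
        }
        return excess / (double) availableShrink;
    }
}
```

For instance, with main content of (600, 700, 900) and a float of (280, 320, 400) on a page of height 1000, the optimum-only test fails (1020 exceeds 1000) while the shrinking test succeeds (880 fits), which mirrors the situation in the two before-float examples.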
+ === Refining the Demerits Computation for Deferred Out-of-line Objects ===
+ A given value is added to the demerits of a feasible break if there are dangling references.
The idea would be to multiply this value by the number of pages separating the reference from
the out-of-line object. The same goes for split footnotes: we would multiply the corresponding
demerits by the number of times the footnote is split.
+ 
+ This should make it possible to drop the special treatment for footnotes (the {{{noBreakBetween}}}
method) which doesn't work well in the situation above. And generally a pagination which
splits a footnote too often would have such high demerits that it should never be preferred.
+ 
+ As a consequence, more feasible breaks may be considered than at present, but again
we are in the case of book-like documents where we can afford the additional memory consumption.
+ 
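The refined demerits could be computed along these lines. This is a minimal sketch: the class and method names are hypothetical, and the two base values are placeholders chosen for illustration, not values confirmed from FOP's code.

```java
/** Illustrative sketch only; these names and constants are hypothetical. */
final class OutOfLineDemerits {

    /** Assumed base demerits for an out-of-line object deferred by one page. */
    static final int DEFERRED_DEMERITS = 10000;

    /** Assumed base demerits for each split of a footnote. */
    static final int SPLIT_FOOTNOTE_DEMERITS = 5000;

    /** Base value multiplied by the number of pages between citation and object. */
    static long deferredDemerits(int citationPage, int outOfLinePage) {
        int pageDifference = Math.max(0, outOfLinePage - citationPage);
        return (long) DEFERRED_DEMERITS * pageDifference;
    }

    /** Base value multiplied by the number of times the footnote is split. */
    static long splitDemerits(int splitCount) {
        return (long) SPLIT_FOOTNOTE_DEMERITS * Math.max(0, splitCount);
    }
}
```

Under these assumptions a float placed on the same page as its citation adds nothing, one deferred by two pages adds twice the base value, and a footnote split three times costs three times as much as one split once.
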

---------------------------------------------------------------------
To unsubscribe, e-mail: fop-commits-unsubscribe@xmlgraphics.apache.org
For additional commands, e-mail: fop-commits-help@xmlgraphics.apache.org

