lucene-dev mailing list archives

From "Hoss Man (JIRA)" <>
Subject [jira] [Commented] (SOLR-10296) Convert existing Ref Guide and post-conversion cleanup
Date Tue, 09 May 2017 01:41:04 GMT


Hoss Man commented on SOLR-10296:

bq. links should all be fixed - so we should be free and clear to bulk remove the OLD_CONFLUENCE_ID
comments in all the source files

I realized that in order to safely remove those we need to sanity check that no links were
trying to use them -- which means SOLR-10640 really needs to be done.

I started working on that and realized there are actually 54 "broken" links -- either because
of this or because of ids that explicitly exist in one page, but implicitly exist in a different
page of the PDF (because of the page shortname).

I'm going to work on cleaning all of those up before pushing my current work.
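
The sanity check described above could be sketched roughly as follows. This is only an illustration, not the actual SOLR-10640 tooling: it assumes the old ids live in comment lines of the form `// OLD_CONFLUENCE_ID: <id>` and that links are AsciiDoc `<<...>>` cross references, and the helper name `find_links_using_old_ids` is hypothetical.

```python
import re
from collections import defaultdict

# ASSUMPTION: the conversion left comments shaped like
#   // OLD_CONFLUENCE_ID: Some-OldAnchorId
# The real comment format in the source files may differ.
OLD_ID = re.compile(r"^//\s*OLD_CONFLUENCE_ID:\s*(\S+)\s*$", re.MULTILINE)

# AsciiDoc cross references: <<target>> or <<target,label>>,
# where target may also be written as page.adoc#anchor.
XREF = re.compile(r"<<([^,>]+)(?:,[^>]*)?>>")


def find_links_using_old_ids(pages):
    """pages maps a file name to its AsciiDoc text (hypothetical input shape).

    Returns {old_id: [file names that link to it]} for every old id
    that some cross reference still targets.
    """
    old_ids = set()
    for text in pages.values():
        old_ids.update(OLD_ID.findall(text))

    offenders = defaultdict(list)
    for name, text in pages.items():
        for target in XREF.findall(text):
            # For inter-document refs like page.adoc#id, keep only the anchor.
            anchor = target.strip().split("#")[-1]
            if anchor in old_ids:
                offenders[anchor].append(name)
    return dict(offenders)
```

An empty result would suggest the comments can be bulk removed; any entry names an old id that a link still depends on.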

> Convert existing Ref Guide and post-conversion cleanup
> ------------------------------------------------------
>                 Key: SOLR-10296
>                 URL:
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: documentation
>            Reporter: Cassandra Targett
> We have developed several tools and scripts for converting the Ref Guide out of Confluence
which get us most of the way to a fully converted set of pages. However, we already know that
there are several issues that could not be automated.
> From, we have this list:
> * The conversion process will insert TODOs for several items that we thought might be
problematic during conversion; these need to be reviewed and resolved. Some of these items
are also covered in the below topics.
> * Block elements in tables. The current version of the PDF creation tool we are using
does not handle those properly (see
In some cases, we should remove the table entirely and present the content in a new way (using,
most often, [labeled lists|] instead).
> * Review and (usually) remove huge Tables of Contents from the top of pages. The current
design of the online version will automatically create a TOC for the page, so we don't need another
one; in some cases this TOC was hand-created, so it can't be removed via conversion.
> * Non-image attachments. Some SVG files will be converted to images, but they should
not be treated as images.
> * Failed link conversions. Despite my best attempts, many dummy URLs are treated by Confluence
as real URLs (meaning, dummy URLs like {{http://<host>:<port>/solr}} are coded
in Confluence's XHTML with <a> tags). These will be converted as URLs but will throw
errors during the conversion process. In some cases, the URLs aren't just these example URLs
but are indicative of a real problem that needs to be resolved.
> * Spurious <br/> tags. Some API pages have a list of available calls structured
as a list but without being a real ordered or unordered list. These will convert badly. The
issue has a list of pages where
this might be a problem.
> * Appropriate Lead Paragraphs. The stylesheet for HTML pages will make the first paragraph
of every HTML page a slightly larger font, by way of introduction. In many cases, the first
paragraph is not really ready for that sort of treatment and should be revised to be a more
succinct introduction to the feature or further contents of the page.
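
For the failed link conversions item in the list above, one way to separate example URLs from links that may indicate a real problem is to look for angle-bracket placeholders such as `<host>` and `<port>`. The sketch below is an assumption about how such a triage could work, not part of the conversion scripts; `split_urls` is a hypothetical helper.

```python
import re

# A placeholder is any <name> token inside a URL, e.g. <host> or <port>.
PLACEHOLDER = re.compile(r"<[A-Za-z][\w-]*>")

# Deliberately loose URL matcher; braces are excluded so JIRA {{...}}
# monospace markup is not swallowed into the URL.
URL = re.compile(r"https?://[^\s\"'{}]+")


def split_urls(text):
    """Return (real, dummy) lists of the URLs found in text."""
    real, dummy = [], []
    for url in URL.findall(text):
        # URLs containing a placeholder are treated as documentation examples.
        (dummy if PLACEHOLDER.search(url) else real).append(url)
    return real, dummy
```

URLs that land in the `real` list are the ones still worth checking by hand.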
> More problems may be added to this issue as items that specifically need to be cleaned
