incubator-allura-dev mailing list archives

From "Dave Brondsema" <brond...@users.sf.net>
Subject [allura:tickets] Re: #6534 Wiki importer for github
Date Mon, 16 Sep 2013 15:27:57 GMT
Searching only for README files to get a ballpark on popularity (e.g. <https://github.com/search?q=path%3AREADME.asciidoc&type=Code&ref=searchresults>),
I get:

* asciidoc: 1000
* org: 3750
* pod: 2200
* rdoc: 135,000
* mediawiki: 800

Rdoc is definitely popular and mediawiki is not.  However, since we have an easy approach
for mediawiki (which may be more popular for wikis than readmes), let's just do mediawiki.
Go with the `mediawiki2markdown` approach.  Remember that it depends on optional GPL libraries,
so keep this conversion optional too.

Let's leave the rest for later; we'll see what the demand is for them.

If it's possible to list the supported formats on the import form's description text, that
would be great.
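
A minimal sketch of how the optional conversion and the list of supported formats could be tied together. The helper names (`load_optional_converter`, `supported_formats`) are hypothetical and not part of Allura; the idea is only that a converter whose dependency is missing is reported as unavailable instead of breaking the importer.

```python
import importlib


def load_optional_converter(module_name, func_name):
    """Return the named converter function, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The optional (possibly GPL) dependency is not installed.
        return None
    return getattr(module, func_name, None)


def supported_formats(converters):
    """List the formats whose converter is actually available."""
    return sorted(name for name, func in converters.items() if func is not None)
```

The import form's description text could then be built from `supported_formats(...)`, so it only advertises mediawiki when that converter actually loaded.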

Which reminds me, a separate issue is that we need an individual tool importer for github
wiki.  That is, specifically, a `GitHubWikiImportController` set on the importer's `controller`
attribute.


---

**[tickets:#6534] Wiki importer for github**

**Status:** in-progress
**Labels:** import github 42cc 
**Created:** Wed Aug 07, 2013 09:54 PM UTC by Dave Brondsema
**Last Updated:** Mon Sep 16, 2013 10:55 AM UTC
**Owner:** nobody

Wikis are git repositories and can be cloned, for example with `git clone https://github.com/OpenRefine/OpenRefine.wiki`.
Check the main repo API first to see if the repo has wiki enabled.  See
https://sourceforge.net/p/googlecodewikiimporter/git/ for an example of another
wiki importer.  It is a separate repo because it needs the "html2text" package to convert
html to markdown, and that is a GPL library.
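
A rough sketch of the "check the API, then clone" step. It assumes the GitHub repo API response (`https://api.github.com/repos/<owner>/<repo>`) carries a boolean `has_wiki` field; the function names are made up for illustration.

```python
import json
import urllib.request


def wiki_clone_url(owner, repo):
    """Build the clone URL of a repo's wiki, following the pattern in the ticket."""
    return "https://github.com/%s/%s.wiki" % (owner, repo)


def has_wiki(repo_info):
    """Check the `has_wiki` flag in a decoded GitHub repo API response."""
    return bool(repo_info.get("has_wiki"))


def fetch_repo_info(owner, repo):
    """Fetch repo metadata from the GitHub API (unauthenticated network call)."""
    url = "https://api.github.com/repos/%s/%s" % (owner, repo)
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Note that the flag may be set even when no wiki pages exist yet, so the importer should probably still handle a failed clone gracefully.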

Github supports many markup types.  Find a full list and determine what the best way to convert
them to markdown is.  My guess is that few formats will have tools available to convert them
directly to markdown, so my likely recommendation would be to render them as HTML (using [pypeline](http://pypeline.sourceforge.net/)
as a generic way to handle many of those formats) and then html2text to get it into markdown.
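
To make the proposed flow concrete, here is a sketch of the per-page dispatch: use a direct converter when one exists, otherwise render to HTML and convert that to markdown. The extension table is an assumption and not a full list, and `render_html` / `html_to_markdown` are placeholders for whatever pypeline and html2text calls end up being used; they are passed in so this sketch does not depend on either library.

```python
import os

# Assumed mapping from wiki file extension to markup name (not a full list).
EXTENSION_TO_MARKUP = {
    ".md": "markdown",
    ".markdown": "markdown",
    ".mediawiki": "mediawiki",
    ".rdoc": "rdoc",
    ".org": "org",
    ".asciidoc": "asciidoc",
    ".pod": "pod",
}


def convert_page(filename, text, direct_converters, render_html, html_to_markdown):
    """Convert one wiki page to markdown.

    direct_converters maps a markup name to a text -> markdown function.
    render_html(filename, text) and html_to_markdown(html) form the generic
    fallback; both are injected so the GPL pieces stay optional.
    """
    ext = os.path.splitext(filename)[1].lower()
    markup = EXTENSION_TO_MARKUP.get(ext)
    if markup == "markdown":
        return text  # already markdown, nothing to convert
    if markup in direct_converters:
        return direct_converters[markup](text)
    # Unknown or unsupported markup: go through HTML.
    return html_to_markdown(render_html(filename, text))
```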

If html2text or any other GPL library is needed, this will have to be a separate repo from
the main Allura repo.  So please evaluate & test the conversion options first, before
putting code into place.

A second phase to all this (i.e. do it separately, after the basic import is all working)
would be to handle revision history.  This would mean going through each commit in the wiki
git repo, and converting & updating every file that changes.  This may be very time consuming,
so when we get to it, we may want it to be a checkbox option, so users only do it if they
want it.
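
For the revision-history phase, one possible starting point is to list the wiki repo's commits oldest first along with the files each one touched, and then convert only those files. This is a sketch, not a tested importer: `wiki_history` assumes `git` is on the PATH and that `repo_dir` is an already-cloned wiki repo, and both function names are invented here.

```python
import subprocess

# Oldest commit first; one "commit:<sha>" line followed by the changed paths.
LOG_ARGS = ["log", "--reverse", "--name-only", "--format=commit:%H"]


def parse_log(output):
    """Turn `git log --name-only` output into [(sha, [paths])], in log order."""
    commits = []
    for line in output.splitlines():
        if line.startswith("commit:"):
            commits.append((line[len("commit:"):], []))
        elif line.strip() and commits:
            # Any non-blank line after a commit marker is a changed path.
            commits[-1][1].append(line)
    return commits


def wiki_history(repo_dir):
    """Run git in a cloned wiki repo and return its commits oldest first."""
    out = subprocess.check_output(["git", "-C", repo_dir] + LOG_ARGS)
    return parse_log(out.decode("utf-8", "replace"))
```

Each `(sha, paths)` pair would then map to one round of page updates, which is where the time cost mentioned above comes from.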


---

Sent from sourceforge.net because allura-dev@incubator.apache.org is subscribed to https://sourceforge.net/p/allura/tickets/

To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/allura/admin/tickets/options.
Or, if this is a mailing list, you can unsubscribe from the mailing list.