Wikis are git repositories and can be cloned like any other, for example: git clone https://github.com/OpenRefine/OpenRefine.wiki
Check the main repo API first to see if the repo has its wiki enabled. For reference, see https://sourceforge.net/p/googlecodewikiimporter/git/ as an example of another wiki importer. It is a separate repo because it needs the "html2text" package to convert HTML to markdown, and that is a GPL library.
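A minimal sketch of the "check the API first" step, assuming the repo metadata has already been fetched from the GitHub repos endpoint and parsed into a dict. The function name `wiki_clone_url` is hypothetical; `has_wiki` and `full_name` are fields of the GitHub repo JSON, and the clone URL follows the pattern in the example above.

```python
def wiki_clone_url(repo_info):
    """Return the wiki clone URL for a repo, or None if the wiki is disabled.

    repo_info is assumed to be the parsed JSON from
    GET https://api.github.com/repos/{owner}/{repo}.
    """
    if not repo_info.get('has_wiki'):
        return None
    # Same URL shape as the OpenRefine example: <repo url> + ".wiki"
    return 'https://github.com/{}.wiki'.format(repo_info['full_name'])
```

The importer could skip the clone entirely when this returns None.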
GitHub supports many markup types. Find a full list and determine the best way to convert each of them to markdown. My guess is that few formats will have tools available to convert them directly to markdown, so my likely recommendation would be to render them as HTML (using pypeline as a generic way to handle many of those formats) and then use html2text to get the result into markdown.
If html2text or any other GPL library is needed, this will have to be a separate repo from the main Allura repo. So please evaluate & test the conversion options first, before putting code into place.
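The two-step conversion suggested above could be sketched roughly as follows. This is only an illustration: `convert_page` is a hypothetical name, the two converters are passed in as plain callables (stand-ins for pypeline and html2text, so nothing GPL is imported here), and the extension sets are my reading of the formats listed in the github/markup README and should be verified.

```python
import os

# Assumed extension lists; verify against the github/markup README.
MARKDOWN_EXTS = {'.markdown', '.mdown', '.mkdn', '.md'}
OTHER_MARKUP_EXTS = {'.textile', '.rdoc', '.org', '.creole', '.mediawiki',
                     '.wiki', '.rst', '.asciidoc', '.adoc', '.asc', '.pod'}


def convert_page(filename, text, render_html, html_to_markdown):
    """Return markdown for a wiki page, or None if it is not a wiki page.

    render_html(filename, text) and html_to_markdown(html) are hypothetical
    stand-ins for the real markup renderer and HTML-to-markdown converter.
    """
    ext = os.path.splitext(filename)[1].lower()
    if ext in MARKDOWN_EXTS:
        return text  # already markdown, nothing to convert
    if ext in OTHER_MARKUP_EXTS:
        return html_to_markdown(render_html(filename, text))
    return None  # unknown extension: not a wiki page
```

Passing the converters in keeps the dispatch logic testable on its own while the conversion options are still being evaluated.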
A second phase to all this (i.e. do it separately, after the basic import is all working) would be to handle revision history. This would mean going through each commit in the wiki git repo, and converting & updating every file that changes. This may be very time consuming, so when we get to it, we may want it to be a checkbox option, so users only do it if they want it.
Closed #439, #442.
All changes in je/42cc_6534
I've rebased this and pushed it to branch db/6534. I resolved several conflicts, so please start further work off of that branch. All specific examples below are from https://github.com/mxcl/homebrew/wiki/
delete() method on the model. Even though it's just one line, it shouldn't be repeated in multiple places.
name_and_ext = filename.split('.', 1) should be changed to use os.path.splitext (or at least rsplit). It causes some pages not to be handled right: "Not a wiki page Homebrew-0.9.3.md. Skipping"
html2text.BODY_WIDTH = 0)
Niche Stuff <a name="Niche_Stuff"></a>
*[[this checklist|Troubleshooting]]* doesn't convert right (Home.textile)
Related Tickets: #6622
Created:
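To illustrate the filename-splitting problem raised in the review, here is a small sketch. `split_page_filename` is a hypothetical helper, not the importer's actual code; it only shows why splitting on the first dot breaks for names like Homebrew-0.9.3.md.

```python
import os


def split_page_filename(filename):
    """Split a wiki filename into (page name, extension without the dot)."""
    name, ext = os.path.splitext(filename)  # splits on the LAST dot only
    return name, ext.lstrip('.')


# Splitting on the first dot mangles names that contain dots:
first_dot = 'Homebrew-0.9.3.md'.split('.', 1)   # ['Homebrew-0', '9.3.md']
# rsplit keeps the page name intact, as does os.path.splitext:
last_dot = 'Homebrew-0.9.3.md'.rsplit('.', 1)   # ['Homebrew-0.9.3', 'md']
```

With the first-dot split, the "extension" becomes 9.3.md, which would explain the "Not a wiki page" message.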
After #449 is done you'll be able to merge this and then we can work on textile-related issues in #450.
Related Tickets: #6534
Closed #449. Force-pushed je/42cc_6534. You can review the fixes now. We'll address the textile-related issues later, in #450.
Merged je/42cc_6534 to master.
Closed #450.
je/42cc_6534_textile_fix
It is lost on all markups (except markdown, which we're not converting at all), not
just on textile. It is caused by html2text. If we need to fix it, then it could be a
separate ticket, since we need to fix it for all markups. It may require changes
in html2text, though I didn't investigate it further, so I'm not sure.
If you take a look at the source of the textile page, you'll see that <a> is actually in the source, so it's not a bug in the converter. https://github.com/mxcl/homebrew/wiki/Acceptable-Formulae/_edit
Stack trace from an attempted mxcl/homebrew import: https://sourceforge.net/p/allura/pastebin/525c61379095476d244795e1
Looks like this commit tried to create 2 wiki pages with the same title (since the extension is dropped).
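One way to spot this situation ahead of time is to group a commit's files by the title they would get once the extension is dropped. This is a sketch under that assumption; `title_collisions` is a hypothetical helper and the real title derivation may differ.

```python
import os
from collections import defaultdict


def title_collisions(filenames):
    """Map each page title to its source files, keeping only duplicated titles."""
    by_title = defaultdict(list)
    for filename in filenames:
        title = os.path.splitext(filename)[0]  # title = filename minus extension
        by_title[title].append(filename)
    return {title: names for title, names in by_title.items() if len(names) > 1}
```

For example, a commit touching both Page.md and Page.textile would report 'Page' as a colliding title.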
Created [#6758] separately for table support. Guess it's not easy.
Related Tickets: #6758
Merged je/42cc_6534_textile_fix. Going to leave open for the above issue Tim posted, and then we'll be done :)
Created #460: [#6534] GH wiki importer: duplicate key error (1cp) for that
Related Tickets: #6534
Closed #460.
je/42cc_6534_last_fix
Basically, there was an issue with multiple deletions of the same file in consecutive commits. It happens if there is a sequence of commits like "delete file -> revert -> delete the same file", or a sequence of commits with file renames that change only the file extension, like "Page.md -> Page.textile -> Page.md". taskd processes such commits within one second, so we end up creating pages with the same title (like 'Page 12:12:12 ...'). Fixed that by adding microseconds to deleted pages' titles. Also added some logic to handle page renames better.
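The effect of the microseconds fix can be shown with a small sketch. The title format here is an assumption for illustration (the real suffix format is not shown in this ticket), `deleted_title` is a hypothetical name, and the date is arbitrary.

```python
from datetime import datetime


def deleted_title(title, when):
    """Build a title for a deleted page, unique down to the microsecond."""
    # %f adds zero-padded microseconds, so two deletions within the same
    # second still get different titles.
    return '{} {}'.format(title, when.strftime('%H:%M:%S.%f'))


# Two deletions of the same page within one second (arbitrary date):
first = deleted_title('Page', datetime(2013, 10, 15, 12, 12, 12, 5))
second = deleted_title('Page', datetime(2013, 10, 15, 12, 12, 12, 900))
```

With second-level resolution both titles would be 'Page 12:12:12'; with microseconds they differ.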