We should have a script to import from a MediaWiki database dump into a wiki tool. Maybe use SQLite to load the database dump? But we should also have an option to connect to a live database, since we'll be able to take advantage of that in some situations.
See scripts/teamforge-import.py for reference on how to create wiki pages programmatically. That script runs in two phases: the first extracts the data and saves it, the second loads the data into Allura. We should do the same, but make the data files be free from phpbb details, so the loading script can be generic. Someday people can write extractors for other wiki software and reuse the same loading script.
This code should be within the ForgeWiki tool, and ideally exposed as a paster command.
The import should handle as much as possible: wiki pages (including converting the format to Markdown; not sure if there are any existing conversion libraries available), attachments, the history of each page (our wiki pages are versioned too), permissions, "Talk" pages (these can go into the discussion for a page; not sure of the best way to split them up into separate comments), other config options, etc.
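For illustration, a minimal sketch of what a wiki-software-agnostic page file in the dump directory could look like (the field names and layout here are hypothetical, not a fixed spec):

    # Hypothetical layout of one extracted page file, e.g. dump_dir/pages/Main_Page.json.
    # All MediaWiki-specific details are resolved during extraction, so the loading
    # phase only ever sees plain data plus already-converted Markdown text.
    import json

    page = {
        'title': 'Main Page',
        'history': [            # oldest first; each entry becomes one page version in Allura
            {'text': '# Welcome\n\nAlready-converted Markdown text...',
             'timestamp': '2012-05-01T12:00:00Z',
             'author': 'WikiSysop',
             'comment': 'initial revision'},
        ],
        'attachments': ['diagram.png'],   # files stored next to the JSON in dump_dir
        'talk': [                         # Talk: page content, loaded as discussion posts
            {'author': 'SomeUser',
             'timestamp': '2012-05-02T09:30:00Z',
             'text': 'A converted talk comment.'},
        ],
    }

    with open('Main_Page.json', 'w') as f:
        json.dump(page, f, indent=2)

The loader would then only need to walk dump_dir and create pages, versions, attachments, and discussion posts from these files, regardless of which wiki software the data originally came from.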
I've created the following tickets:
- #57: [#4186] Convert mediawiki to markdown (3cp)
- #58: [#4186] Create basic paster command skeleton to import from mediawiki (2cp)
- #59: [#4186] Import pages (1cp)
- #60: [#4186] Import history of each page (1cp)
- #61: [#4186] Import attachments (1cp)
- #62: [#4186] Import talk pages (1cp)
- #63: [#4186] Import refactoring and optimization (2cp)
Total: 11cp
Originally by: tramadolmen
Can I use the third-party package https://github.com/zikzakmedia/python-mediawiki for the mediawiki-to-markdown conversion?
In requirements-common.txt it would be something like this:
-e git+https://github.com/zikzakmedia/python-mediawiki.git...
Yes, you can use that package for the conversion. Our deployment process requires that libraries be packaged up, so the github reference won't actually work for us. For now can you just manually install python-mediawiki? And then when we're done, we can make a package of it for our deployment.
Looks like https://github.com/erikrose/mediawiki-parser might be another option. python-mediawiki is GPL and I'm not sure what license mediawiki-parser uses. It would be preferable not to depend on a GPL-licensed package, but if it is the only option, it's okay for this.
Originally by: tramadolmen
mediawiki-parser depends on pijnu. pijnu has a GPL license and can't be installed without patching its setup.py file.
closed #57
Looking good so far. I'm not exactly clear why you need bbcode, but if it works, great :)
I would recommend adding a test specifically for mediawiki formatting, rather than bbcode though. For example:
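Something along these lines (a sketch only; the forgewiki.converters import path and the exact Markdown output are assumptions that depend on where the converter ends up living and which library it wraps):

    from forgewiki.converters import mediawiki2markdown  # assumed module path

    def test_mediawiki2markdown():
        # MediaWiki markup in, Markdown out; adjust expectations to the real converter
        source = "== Heading ==\n'''bold''' and ''italic''\n* list item"
        result = mediawiki2markdown(source)
        assert '## Heading' in result
        assert '**bold**' in result
        assert '* list item' in result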
Originally by: tramadolmen
I added bbcode because the ticket says: "make the data files be free from phpbb details, so the loading script can be generic."
Oh, my fault :( I made a very similar ticket for migrating phpbb, and then copied it for this mediawiki ticket. I meant to say "make the data files be free from mediawiki details, so the loading script can be generic."
closed #58
We need the file itself attached to the wiki page.
And then any links or references (e.g. image tags) need to be converted to appropriate Markdown references. Here's how you link to an attachment:
[link text here](PageNameHere/attachment/FileNameHere.pdf)
and to use an attachment as an image inline, do: [[img src=attached-image.jpg alt=foobar]]
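A rough sketch of how the MediaWiki file references could be rewritten to that syntax during conversion (simplified: thumbnail, size, and alignment options are ignored, and attachments are assumed to keep their original file names):

    import re

    def convert_file_refs(text):
        """Rewrite [[File:...]] / [[Image:...]] tags to Allura's img macro.

        [[File:attached-image.jpg|foobar]] -> [[img src=attached-image.jpg alt=foobar]]
        """
        def repl(match):
            parts = match.group(1).split('|')
            src = parts[0].strip()
            alt = parts[-1].strip() if len(parts) > 1 else src
            return '[[img src=%s alt=%s]]' % (src, alt)
        return re.sub(r'\[\[(?:File|Image):([^\]]+)\]\]', repl, text)

    print(convert_file_refs('See [[File:attached-image.jpg|foobar]] for details.'))
    # -> See [[img src=attached-image.jpg alt=foobar]] for details.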
closed #59
closed #60
closed #62
closed #61, changes are in 42cc_4186
We have one ticket left regarding this feature - #63 [#4186] Import refactoring and optimization.
Please review our changes, or give us some test data to check the whole import procedure.
Sent test data in email.
closed #63 and pushed changes to 42cc_4186
After a complete review we decided to rewrite some parts of the code. Most of that was already done in #63, but we need to finish a few new tickets:
- #85: [#4186] Import/export talk, attachments and history (2cp)
- #86: [#4186] Extract data from sqlite (1cp)
At the moment it works with a MySQL mediawiki database directly and imports pages only. Here's how we run it:
$ paster wiki2markdown ../Allura/development.ini -d dump_dir -n Projects -p test -s mysql --user root --password qwerty --db_name mediawiki_test
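For reference, a minimal sketch of how page text can be read straight from a MediaWiki MySQL database, using the standard page -> revision -> text tables of the pre-1.31 schema (the connection settings are illustrative, matching the options in the command above; this is not code from the actual tool):

    import MySQLdb  # or any MySQL driver with the same DB-API interface

    conn = MySQLdb.connect(host='localhost', user='root',
                           passwd='qwerty', db='mediawiki_test')
    cur = conn.cursor()

    # Latest text of every main-namespace page: page -> revision -> text
    cur.execute("""
        SELECT p.page_title, t.old_text
        FROM page p
        JOIN revision r ON r.rev_id = p.page_latest
        JOIN text t ON t.old_id = r.rev_text_id
        WHERE p.page_namespace = 0
    """)
    for title, raw_text in cur.fetchall():
        # old_text is raw MediaWiki markup (check old_flags for compression);
        # it still has to go through the markdown conversion step.
        print('%s: %d bytes' % (title, len(raw_text)))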
If loading the MySQL dump into SQLite is not really easy, I think reading from MySQL directly is a fine option. I can imagine there are syntax differences that would be annoying to deal with (and perhaps a lot of work to truly parse & translate correctly).
OK, that's easier; closed #86: [#4186] Extract data from sqlite.