Apache Allura™ / Tickets / #4186 Import from mediawiki into wiki tool

Dave Brondsema - 2012-05-10

labels: --> import

milestone: limbo --> forge-backlog
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Description has changed:

Diff:

--- old 
+++ new 
@@ -1,5 +1,5 @@
 We should have a script to import from a mediawiki database dump into a discussion tool.  Maybe use sqlite to load the database dump?  But also have an option to connect to a live database, since we'll be able to take advantage of that in some situations.

-See scripts/teamforge-import.py for reference on how to create wiki pages programmatically.
+See scripts/teamforge-import.py for reference on how to create wiki pages programmatically.  That script uses 2 phases, first to extract the data and save it, second to load the data into Allura.  We should do that, but make the data files be free from phpbb details, so the loading script can be generic.  Some day people can write extractors from other wiki software and use the same loading script.

 This code should be within the ForgeWiki tool, and ideally exposed as a paster command.

Description has changed:

Diff:

--- old 
+++ new 
@@ -3,3 +3,6 @@
 See scripts/teamforge-import.py for reference on how to create wiki pages programmatically.  That script uses 2 phases, first to extract the data and save it, second to load the data into Allura.  We should do that, but make the data files be free from phpbb details, so the loading script can be generic.  Some day people can write extractors from other wiki software and use the same loading script.

 This code should be within the ForgeWiki tool, and ideally exposed as a paster command.
+
+The import should handle as much as is possible: wiki pages (including converting the format to markdown - not sure if there is any existing conversion libraries available), attachments, history of each page (our wiki pages are versioned too), permissions, "Talk" pages (can go into the discussion for a page - not sure the best way to split up into separate comments), other config options, etc.
+

Dave Brondsema - 2012-05-18

labels: import --> import, 42cc
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Yaroslav Luzin - 2012-05-21

I've created the following tickets:
- #57: [#4186] Convert mediawiki to markdown (3cp)
- #58: [#4186] Create basic paster command skeleton to import from mediawiki (2cp)
- #59: [#4186] Import pages (1cp)
- #60: [#4186] Import history of each page (1cp)
- #61: [#4186] Import attachments (1cp)
- #62: [#4186] Import talk pages (1cp)
- #63: [#4186] Import refactoring and optimization (2cp)
Total: 11cp

Related

Tickets: ~~#4186~~

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Anonymous - 2012-05-22

Originally by: tramadolmen

Can i use for conversion mediawiki to markdown third party package https://github.com/zikzakmedia/python-mediawiki?

In requirements-common.txt will be something like that:
-e git+https://github.com/zikzakmedia/python-mediawiki.git...

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Dave Brondsema - 2012-05-22

Yes, you can use that package for the conversion. Our deployment process requires that libraries be packaged up, so the github reference won't actually work for us. For now can you just manually install python-mediawiki? And then when we're done, we can make a package of it for our deployment.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Dave Brondsema - 2012-05-22

Looks like https://github.com/erikrose/mediawiki-parser might be another option. The python-mediawiki is GPL and I'm not sure what license mediawiki-parser uses. It would be preferable not to depend on a GPL licensed package, but if it is the only option, its okay for this.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Anonymous - 2012-05-23

Originally by: tramadolmen

mediawiki-parser depends on pijnu. pijnu have GPL licence and can't install without patching setup.py file.

Last edit: Anonymous 2015-07-08

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Yaroslav Luzin - 2012-05-28

closed #57

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Dave Brondsema - 2012-05-29

Looking good so far. I'm not exactly clear why you need bbcode, but if it works, great :)

I would recommend adding a test specifically for mediawiki formatting, rather than bbcode though. For example:

+ mw_formatting = "'''bold''' ''italics''" + mw_output = converters.mediawiki2markdown(mw_formatting) + assert "**bold** _italics_" in mw_output
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Anonymous - 2012-05-30

Originally by: tramadolmen

I have added bbcode because in ticket you say:

We should do that, but make the data files be free from phpbb details, so the loading script can be generic.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Dave Brondsema - 2012-05-30

Oh, my fault :( I made a very similar ticket for migrating phpbb, and then copied it for this mediawiki ticket. I meant to say "make the data files be free from mediawiki details, so the loading script can be generic."

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Yaroslav Luzin - 2012-05-30

closed #58

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Anonymous - 2012-06-04

Nike Free Run+ Mens odiamynwoj chaussures nike free run reduction jhevlncoqml jordan 3 online sale aivdgoiva cheap jordan 10 retro afxvwkggc jordan 10 ktoiyncnoik jordan 4 outlet ttfupwtjx online jordan 1 ahwkembex chaussures nike free run vente wwqnmm cheap jordan retro online
http://awfvsozlukhaber.sozlukhaber.com/h/9023/

Last edit: Anonymous 2016-12-24

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Dave Brondsema - 2012-06-04

We need the file itself attached to the wiki page.

And then any links or references (e.g. image tags) to be be converted to appropriate Markdown references. Here's how you link to an attachment [link text here](PageNameHere/attachment/FileNameHere.pdf) and to use an attachment as an image inline, do: [[img src=attached-image.jpg alt=foobar]]

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Yaroslav Luzin - 2012-06-05

closed #59

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Yaroslav Luzin - 2012-06-05

closed #60

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Yaroslav Luzin - 2012-06-05

closed #62

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Yaroslav Luzin - 2012-06-06

closed #61, changes are in 42cc_4186

We have one ticket left regarding this feature - #63 [#4186] Import refactoring and optimization.

Please review our changes or give us some test data to check the whole import procedure

status: open --> code-review

Related

Tickets: ~~#4186~~
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Dave Brondsema - 2012-06-11

status: code-review --> open
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Dave Brondsema - 2012-06-11

Sent test data in email.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Yaroslav Luzin - 2012-06-13

closed #63 and pushed changes to 42cc_4186

After a complete review we decided to re-write some parts of the code, and it was already almost done in #63 but we need to finish a few new tickets:
- #85: [#4186] Import/export talk, attachments and history (2cp)
- #86: [#4186] Extract data from sqlite (1cp)

At the moment it works with a MySQL mediawiki database directly and imports pages only. Here's how we run it:

$ paster wiki2markdown ../Allura/development.ini -d dump_dir -n Projects -p test -s mysql --user root --password qwerty --db_name mediawiki_test

status: open --> in-progress

Related

Tickets: ~~#4186~~
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Dave Brondsema - 2012-06-14

If loading the msyql dump into sqlite is not real easy, I think we have options for reading from mysql directly. I can imagine there are syntax differences which would be annoying to deal with (and perhaps a lot of work to truly parse & translate correctly)

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Yaroslav Luzin - 2012-06-25

ok, that's easier, closed #86: [#4186] Extract data from sqlite

Related

Tickets: ~~#4186~~

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Apache Allura™

Forge software for hosting software projects

Milestone

Searches

Help

#4186 Import from mediawiki into wiki tool

Related

Discussion

Related

Related

Related

Related