Create an importer for wiki pages for Google Code projects. The importer should follow the framework discussed on the mailing list and integrate with the project importer from [#6456].
The list of pages will have to be parsed out from the list page (e.g., https://code.google.com/p/support/w/list with a jQuery selector of `#resultstable tbody a`). The page contents and comments will need to be extracted from the elements with an id of `wikicontents` and `commentlist`, respectively.
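As a rough illustration of the list-parsing step only, here is a minimal sketch using just the standard library. It approximates the `#resultstable tbody a` selector by tracking open tags. The names `WikiListParser` and `wiki_page_links` are hypothetical and not part of any existing importer code, and the real list page markup may need more care than this.

```python
from html.parser import HTMLParser


class WikiListParser(HTMLParser):
    """Collect hrefs of <a> tags matching '#resultstable tbody a' (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open elements as (tag, id) pairs
        self.links = []   # hrefs found inside the results table body

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'a' and self._in_results_body() and attrs.get('href'):
            self.links.append(attrs['href'])
        self.stack.append((tag, attrs.get('id')))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, tolerating unclosed tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def _in_results_body(self):
        # True when a <tbody> is open somewhere below the element with id="resultstable".
        seen_table = False
        for tag, elem_id in self.stack:
            if elem_id == 'resultstable':
                seen_table = True
            elif seen_table and tag == 'tbody':
                return True
        return False


def wiki_page_links(html):
    """Return the unique wiki page hrefs from a list page, in document order."""
    parser = WikiListParser()
    parser.feed(html)
    return list(dict.fromkeys(parser.links))
```

The de-duplication assumes a row may link to the same page from more than one cell; if that turns out not to be the case it is harmless.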
Support for a user-mapping might be nice, but is unlikely to be terribly useful, so comments should probably just be attributed to `*anonymous`, perhaps with a link prepended to the comment that links to the Google Code user profile page. (Or maybe we should only import the page contents?)
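One possible shape for the comment attribution, sketched under assumptions: the function name `format_imported_comment`, the "Originally posted by" wording, and the `https://code.google.com/u/<name>/` profile URL pattern are all illustrative guesses rather than anything settled in this ticket.

```python
from urllib.parse import quote

# The comment itself would be attributed to this user, as suggested above.
ANONYMOUS_USER = '*anonymous'


def format_imported_comment(author, body):
    """Prepend a Markdown link to the original author's profile (assumed URL pattern)."""
    profile_url = 'https://code.google.com/u/%s/' % quote(author)
    return 'Originally posted by [%s](%s)\n\n%s' % (author, profile_url, body)
```

Keeping the original author in the comment text preserves the attribution without needing a user mapping.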
The HTML to Markdown library is GPL, so we'll want to put this in a separate library / package, and we should move the MediaWiki importer with it.
Also, there is some example code parsing the wiki data here
The example code only parses out Featured Wiki links from the project summary, not the actual wiki content.
Related Tickets: #6456
forge:tv/6458
googlecodewikiimporter:master
href="/p/modwsgi/wiki/InstallationInstructions"
which just happens to still work since the URL structure matches, but fails to work if the project or mount point is changed.
Changes on:
forge:tv/6458
googlecodewikiimporter:tv/6458
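For the link problem noted above (imported pages keeping hrefs such as `/p/modwsgi/wiki/InstallationInstructions`), one option is to rewrite those links against the target wiki's URL at import time. This is only a sketch: `rewrite_wiki_links` and its parameters are hypothetical, it only handles double-quoted hrefs, and it is not necessarily what the branches above do.

```python
import re


def rewrite_wiki_links(html, source_project, target_wiki_url):
    """Point Google Code wiki links at the imported wiki (hypothetical helper).

    source_project:  Google Code project name, e.g. 'modwsgi'
    target_wiki_url: URL of the imported wiki's mount point, ending in '/'
    """
    pattern = re.compile(r'href="/p/%s/wiki/([^"#?]+)' % re.escape(source_project))
    # A function replacement avoids backslash-escape surprises in target_wiki_url.
    return pattern.sub(lambda m: 'href="%s%s' % (target_wiki_url, m.group(1)), html)
```

Links to other projects are left untouched, since only hrefs under the source project's `/wiki/` path match.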