#1029 Trac import spike

v1.0.0
closed
sf-8 (45)
General
nobody
2015-08-20
2010-10-11
Mark Ramm
No

We need some sort of trac import system that we can use to import the support tickets into the forge project.

Other external projects that use Trac, like Django, Pylons, and TurboGears, ought to be able to use the same system, which runs against a customer-facing API.

  • Ticket import API
  • Wiki import API (with history)?

Ultimately we will want to be able to use the same API to import tickets, etc. from sf.net, GitHub, BitBucket, GoogleCode, etc.

There was an effort to be able to get exported data into a standardized format called ForgePlucker, and we should see if any of that can be used.

Discussion

  • Paul Sokolovsky - 2010-10-13
    • labels: -->
    • status: open --> in-progress
    • custom_field__size: 4 --> 4
     
  • Paul Sokolovsky - 2010-10-13

    To support an import API on Allura's side, we just need to provide a comprehensive and unconstrained artifact/item creation API. By unconstrained I mean, for example, the ability to create a tracker ticket in the closed state. The API should of course allow adding comments/discussion to individual items, and it should allow creating such comments with arbitrary dates. This opens the door to some abuse, so maybe we should indeed split the external APIs into normal and migration ones, and set a special flag on artifacts created through the migration API.
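
    As a rough sketch of the normal vs. migration split described above: the function name `create_ticket`, the `imported` flag and the list of restricted fields are all assumptions made for illustration, not Allura's actual API.

```python
from datetime import datetime, timezone

# Assumed list of fields that only a migration caller may set explicitly.
RESTRICTED_FIELDS = {"status", "created_date"}


def create_ticket(data, migration=False):
    """Build a ticket dict; only migration callers may override restricted fields."""
    if not migration:
        blocked = RESTRICTED_FIELDS.intersection(data)
        if blocked:
            raise PermissionError(
                "fields reserved for the migration API: %s" % sorted(blocked)
            )
    ticket = {
        "status": "open",
        "created_date": datetime.now(timezone.utc),
        # Hypothetical marker so imported artifacts can be told apart later.
        "imported": migration,
    }
    ticket.update(data)
    return ticket


# A migration caller can create an already-closed ticket with an old date.
old = create_ticket(
    {"summary": "Imported bug", "status": "closed",
     "created_date": datetime(2009, 5, 1, tzinfo=timezone.utc)},
    migration=True,
)
print(old["status"], old["imported"])
```

    Whether this ends up as separate URLs or one URL with a permission check, the restriction itself can live in one shared code path.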

     
  • Mark Ramm - 2010-10-13

    The internal API implementation should share as much as possible, but I agree migration API usage should be a separate permission that we set, and normal users should not have it.

    If that means having separate URLs, that's fine; if it means having the same URLs with different permission levels, that would be fine with me as well.

    Also, we should create some marker that shows that it was "automatically created by a migration script."

    There's also an issue related to external username to internal user mappings, which ought to be in the configuration for the migration script or whatever.
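
    A minimal sketch of what such a mapping could look like in the migration script's configuration. The JSON layout, the `user_map` key and the usernames are hypothetical.

```python
import json

# Hypothetical config layout; "user_map" and the names below are made up.
EXAMPLE_CONFIG = """
{
  "user_map": {
    "trac_alice": "alice",
    "trac_bob": "bob"
  }
}
"""


def load_user_map(config_text):
    """Parse the config and return the external -> internal username mapping."""
    return json.loads(config_text).get("user_map", {})


def resolve_user(external_name, user_map):
    """Return the internal username, or None when no mapping is configured."""
    return user_map.get(external_name)


user_map = load_user_map(EXAMPLE_CONFIG)
print(resolve_user("trac_alice", user_map))
print(resolve_user("unknown_reporter", user_map))
```

    Returning None for unmapped names leaves the policy for unknown users to the caller.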

     
  • Paul Sokolovsky - 2010-10-19

    I had a detailed look at ForgePlucker as well as the whole area of "client-side" bugtracker export and formats.

    Well, the ForgePlucker project consists of a lot of belles-lettres, but it doesn't really have a formal spec or code that could be called a "reference". The closest thing to a spec is http://home.gna.org/forgeplucker/forge-ontology.html , but it's informal rather than formal. There are of course good reasons for that: while sharing the same concepts, forges vary widely in detail, so the authors themselves say they should support a few forges well enough (for capture) before producing a common data scheme.

    With that, I wanted to look at the data actually generated, but no real samples are provided, and it turned out to be a pain to get the ForgePlucker app working (bitrot, strange ways of web-scraping, etc.).

    Well, in the end I got samples on which we can base the intermediate format.

    I also looked around. Here's the page "Open Source Bug management tools":
    https://picoforge.int-evry.fr/cgi-bin/twiki/view/Helios_wp3/Web/OpenSourceBugManagementTools
    There are tools which export/scrape bugtrackers, but none explicitly talks about a portable format for representation.

    http://coclico-project.org aims to provide standards for forge interoperability, but there don't seem to be any deliverables available, at least in English.

     
  • Paul Sokolovsky - 2010-10-19

    So, based on the export options research above, as well as an initial import prototype, I have the following proposal for the migration service:

    1. As we'd need to patch/redesign the existing item creation API to handle migration, we might as well provide a separate migration API, which will work with bulk data in a ForgePlucker-like representation.

    2. Migration API access is granted with a limited-time (or maybe even one-time) API key, produced on demand.

    3. Migrations are explicitly tracked with a Migration object, containing data like the user requesting the migration, the date, etc. All tool-specific artifacts created by the migration service have an internal field "migration_id" pointing to a migration. The field may not be visible in the user UI, but it provides internal accountability.

    4. It makes sense to provide a migration validation service, which will check a user's migration data for correctness. Then we could indeed provide one-time migration keys.
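
    The proposal above could be sketched roughly as follows. Only the "migration_id" field name comes from the proposal; the class names, the `redeem` method and `import_tickets` are illustrative assumptions, and persistence is left out entirely.

```python
import secrets
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class Migration:
    """Tracks who requested a migration and when (point 3)."""
    requested_by: str
    migration_id: str = field(default_factory=lambda: secrets.token_hex(8))
    requested_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class MigrationKey:
    """Limited-time, one-time key tied to a single migration (point 2)."""
    migration: Migration
    expires: datetime
    token: str = field(default_factory=lambda: secrets.token_urlsafe(16))
    used: bool = False

    def redeem(self, now=None):
        """Consume the key; return False if it is expired or already used."""
        now = now or datetime.now(timezone.utc)
        if self.used or now >= self.expires:
            return False
        self.used = True
        return True


def import_tickets(tickets, key):
    """Bulk import: redeem the key once, stamp every artifact with migration_id."""
    if not key.redeem():
        raise PermissionError("migration key expired or already used")
    return [dict(t, migration_id=key.migration.migration_id) for t in tickets]


migration = Migration(requested_by="alice")
key = MigrationKey(
    migration=migration,
    expires=datetime.now(timezone.utc) + timedelta(hours=1),
)
imported = import_tickets([{"summary": "First"}, {"summary": "Second"}], key)
print(all(t["migration_id"] == migration.migration_id for t in imported))
```

    Redeeming the key once per bulk request, rather than once per ticket, matches the idea of a one-time key used after validation has passed.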

     
  • Paul Sokolovsky - 2010-10-27

    OK, figuring out the ins and outs of the ORM to handle setting some fields took some time, but things are shaping up well.

    Note that "migration" is ambiguous: it means both Allura's own DB upgrades and imports from 3rd-party tools. So I use either "3rd-party migration", or "import" where it's Allura-centric (as in the API implementation).

    In branch ps/1029, the Trac export and Allura import scripts live in /migrate-3rdparty/ . They are pretty much just scripts, and might live under scripts/, but maybe we'll want to make a separate package with setup.py out of them (at least because there are 3rd-party dependencies), so I left them in the top-level dir.

    Example of Trac export:

    python trac_export.py https://sourceforge.net/apps/trac/sourceforge --limit=5 >trac.json

    Example of Allura import:

    python allura_import.py --base-url=http://localhost:8080 -a <API key> -s <Skey> trac.json

    Then go directly to http://localhost:8080/p/test/bugs/1 (list display problem is still there).

     
  • Paul Sokolovsky - 2010-10-27

    Current TODOs:

    • Figure out ticket list display problem - now, when all custom fields are registered properly, tickets show up in list
    • Handle keywords (easy) - 3 way supported
    • Handle Trac fields not available in Allura as custom fields
    • Handle attachments - need to decide on size limit/external attachments support (see below)
    • Allow to import with ticket# preservation into an empty tracker - ticket ids are now preserved whenever possible by default
    • Elaborate validate_import call (as well as error handling in perform_import call)
    • Access control for perform_import
    • Elaborate JSON format to be more of project data than just ticket list (currently just ticket subset of ForgePlucker format) - now skeleton of complete ForgePlucker format (starting from project root) is expected.
    • From above, possibly make separate tool for controlling migration (but with external per-tool handlers still, apparently)
     
  • Paul Sokolovsky - 2010-11-06

    Attachments are now exported from Trac (--attachments required) and imported into Allura. They are being downloaded online (within REST request processing) from the reference URL (original Trac instance in our case). A size limit is defined, exceeding which causes an attachment to be skipped (import validation check should be added warning a user beforehand of such issue).
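
    The pre-import validation check mentioned above could look something like this sketch. The attachment record layout (`filename`, `size`, `url`), the function name and the limit value are assumptions; the actual limit is not stated here.

```python
# Assumed limit for illustration only.
MAX_ATTACHMENT_SIZE = 1024 * 1024


def check_attachments(attachments, max_size=MAX_ATTACHMENT_SIZE):
    """Split attachments into those to import and warnings for skipped ones."""
    accepted, warnings = [], []
    for att in attachments:
        if att["size"] > max_size:
            warnings.append(
                "%s (%d bytes) exceeds the %d byte limit and will be skipped"
                % (att["filename"], att["size"], max_size)
            )
        else:
            accepted.append(att)
    return accepted, warnings


accepted, warnings = check_attachments(
    [
        {"filename": "patch.diff", "size": 2048, "url": "http://example.com/a"},
        {"filename": "dump.bin", "size": 5 * 1024 * 1024,
         "url": "http://example.com/b"},
    ]
)
for line in warnings:
    print("WARNING:", line)
```

    Running the same check during validation and during the import itself would keep the warning and the actual skip behaviour consistent.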

    And what if we provide support for "external" attachments, where instead of storing attachment body, we store a URL to it?

     
  • Paul Sokolovsky - 2010-11-15
    • status: in-progress --> validation
    • custom_field__size: 4 --> 4
     
  • Paul Sokolovsky - 2010-11-15

    Rick's review comments:

    • when encountering unknown users, I think we should not create placeholder users; those user objects will be forever orphaned. We should instead use the anonymous user and add a note to the object being imported's text with the imported user name. (See some of the old allura tickets to get a feel for what I mean -- anything created by sfx-overlords)
    • we shouldn't use if_missing={} anywhere (that's the default anyway, and you shouldn't use mutable defaults) -- if you really want if_missing={}, do if_missing=lambda:{} (that way you get a clean dict each time)
    • I tend not to like one-liner methods (get_custom_fields, etc) particularly when they're only manipulating public attributes. Unless there's a big performance hit, a @property is also preferable to a get_* method.
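
    To illustrate the last two points in plain Python: the `Field` class below is a toy stand-in written for this example, not the ORM's actual field implementation, and `Ticket` is equally hypothetical.

```python
class Field:
    """Toy schema field holding either a default value or a default factory."""

    def __init__(self, if_missing=None):
        self.if_missing = if_missing

    def default(self):
        # Call the default if it is a factory, otherwise return it as-is.
        if callable(self.if_missing):
            return self.if_missing()
        return self.if_missing


shared = Field(if_missing={})           # one dict object reused for every document
fresh = Field(if_missing=lambda: {})    # a clean dict each time

a = shared.default()
a["x"] = 1
b = shared.default()    # same object as a, so the change leaks into b

c = fresh.default()
c["x"] = 1
d = fresh.default()     # independent dict, unaffected by c


class Ticket:
    """A read-only property instead of a get_custom_fields() method."""

    def __init__(self, custom_fields):
        self._custom_fields = custom_fields

    @property
    def custom_fields(self):
        return self._custom_fields


print(b, d, Ticket({"size": 4}).custom_fields)
```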

    The API key UI will actually be implemented by sfx, so we'll need to coordinate with jwh on that.

     
  • Mark Ramm - 2010-11-15
    • status: validation --> closed
     
  • Paul Sokolovsky - 2010-11-15
    • status: closed --> validation
    • custom_field__milestone: nov-15 --> nov-22
     
  • Paul Sokolovsky - 2010-11-15

    Sorry, this isn't finished yet: I need to address the review comments and merge, at least (though there are more issues, like sandboxes not working for the REST API for me).

     
  • Paul Sokolovsky - 2010-12-15
    • status: validation --> closed
     
  • Rick Copeland - 2010-12-16
    • custom_field__milestone: backlog --> dec-13
     
