#1767 Split up long import JSON by chunks in allura_import.py

v1.0.0
closed
Tracker
nobody
2015-08-20
2011-03-21
No

Offshot of [#1618]. Post tickets in limited chunks to import API, in case import JSON is very big. As suggested by Christoph Zwerschke of TG2 project.

Related

Tickets: #1618
Tickets: #1768
Tickets: #1861

Discussion

  • Paul Sokolovsky - 2011-04-01

    Per [#1861], chunk size will be 1. This provides best protection against protocol request limits and timeouts, and appears to streamline import too (by properly scheduling background tasks).

    • milestone: apr-14 --> apr-7
     

    Related

    Tickets: #1861

  • Paul Sokolovsky - 2011-04-02

    Some stats:

    $ time python allura_import_test.py data/sf100.json -u https://sf-psokolovsky-3025.sb.sf.net -a tckff494ed3e68e8dafdf09 -s bc5799e658b48433ee0461da88921c9c62f3871cc6a2fa7ab612228a36fc4e062bf524bc88d7af00

    real 4m55.725s

    $time python allura_import_test.py data/sf1000.json -u https://sf-psokolovsky-3025.sb.sf.net -a tckff494ed3e68e8dafdf09 -s bc5799e658b48433ee0461da88921c9c62f3871cc6a2fa7ab612228a36fc4e062bf524bc88d7af00

    real 51m29.992s

    So, when run against sandbox, the import time is back to originally seen 1K/hr, unlike localdev improvement as reported in [#1861]. I wonder, if this has to do with amqp processing (which is now off on localdev, but as before in use on sandboxes).

     

    Related

    Tickets: #1861

  • Paul Sokolovsky - 2011-04-03

    Ok, this is done now, in ps/1767. Expected result is that import (e.g. allura_import_test.py) with hundreds and thousands of tickets should work now, whereas previously it timed out (against sandbox).

    • status: open --> code-review
    • assigned_to: Paul Sokolovsky --> Dave Brondsema
     
  • Dave Brondsema

    Dave Brondsema - 2011-04-05

    The enumerate() function would be cleaner here:

    +    cnt = 0
    +    for ticket_in in tickets_in:
    +        cnt += 1
    

    Is scripts/allura_api.py for just the import API? If so, let's rename the file and the class to reflect that. So it is not confused with the "regular" oauth API. Hmm... or is it using our old pre-oauth signing? That is planned to be removed.

    ea52be296656a226bbee4f93b08c2443e24c790f could be improved by logging the exception when a retry happens. We've had trouble before with not knowing about exceptions because they were silenced with code similar to this.

    Lastly, what would an appropriate test be? Running allura_import_test.py I assume. What about allura_import.py? Anything I need to know to run that? BTW, we could put allura_import_test.py into the forge-classic integration tests, which run against full operational sandboxes.

    • status: code-review --> in-progress
    • assigned_to: Dave Brondsema --> Paul Sokolovsky
     
  • Paul Sokolovsky - 2011-04-06
    • enumerate() - done

    • allura_api.py - well, as I told previously, I was considering to make a single class to access all types of APIs we provide. So, I wonder to factor out ticket API access, remember that idea, and named if allura_api.py instead of allura_import_api.py . But looking at the evolution of API access, I guess it doesn't make much sense to have such single - it's ok to have general OAuth API client and special-purpose client(s) for restricted API(s). So, renamed to allura_import_api.py , and I made changes to [#1650].

    • Auto-retry on network error - added logging

    For the tests, as we discussed, I'll move allura_import_test.py test (and corpus for it) to forge-classic, and look into make it fully automatic.

     

    Related

    Tickets: #1650

  • Dave Brondsema

    Dave Brondsema - 2011-04-07
    • status: in-progress --> code-review
    • assigned_to: Paul Sokolovsky --> Dave Brondsema
     
  • Paul Sokolovsky - 2011-04-07

    Test moved to forge-classic, made automatic. I also did some steps towards integration tests to be runnable on standalone Allura (e.g. localdev). The main culprit there is different set of users in Allura and small dataset, we should either bring them together, or use get_site_admin() and friends like proposed in my code.

    ps/1767 in both forge and forge-classic

     
  • Paul Sokolovsky - 2011-04-07

    Btw, as inter-repo move was required, I merged some changes against #1768 into ps/1767 too.

     
  • Dave Brondsema

    Dave Brondsema - 2011-04-07
    • status: code-review --> closed
     

Log in to post a comment.